Abstract

The goal of this work is to propose a finite population counterpart
to Eigen's model, which incorporates stochastic effects. We consider
a Moran model describing the evolution of a population of size $m$
of chromosomes of length $\ell$ over an alphabet of cardinality $\kappa$. The
mutation probability per locus is $q$. We deal only with the sharp peak
landscape: the replication rate is $\sigma > 1$ for the master sequence and
1 for the other sequences. We study the equilibrium distribution of
the process in the regime where

$$\ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0,$$

$$\ell q \to a \in \,]0,+\infty[\,, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty].$$

We obtain an equation $\alpha\,\phi(a) = \ln\kappa$ in the parameter space $(a, \alpha)$
separating the regime where the equilibrium population is totally
random from the regime where a quasispecies is formed. We observe
the existence of a critical population size necessary for a quasispecies
to emerge and we recover the finite population counterpart of the
error threshold. Moreover, in the limit of very small mutations, we
obtain a lower bound on the population size allowing the emergence
of a quasispecies: if $\alpha < \ln\kappa/\ln\sigma$ then the equilibrium population
is totally random, and a quasispecies can be formed only when $\alpha > \ln\kappa/\ln\sigma$.
Finally, in the limit of very large populations, we recover
an error catastrophe reminiscent of Eigen's model: if $\sigma e^{-a} < 1$ then
the equilibrium population is totally random, and a quasispecies can
be formed only when $\sigma e^{-a} > 1$. These results are supported by
computer simulations.

1 Introduction.



In his famous paper [12], Eigen introduced a model for the evolution of
a population of macromolecules. In this model, the macromolecules repli- 
cate themselves, yet the replication mechanism is subject to errors caused 
by mutations. These two basic mechanisms are described by a family of 
chemical reactions. The replication rate of a macromolecule is governed by 
its fitness. A fundamental discovery of Eigen is the existence of an error 
threshold on the sharp peak landscape. If the mutation rate exceeds a crit- 
ical value, called the error threshold, then, at equilibrium, the population 
is completely random. If the mutation rate is below the error threshold, 
then, at equilibrium, the population contains a positive fraction of the mas- 
ter sequence (the most fit macromolecule) and a cloud of mutants which 
are quite close to the master sequence. This specific distribution of indi- 
viduals is called a quasispecies. This notion has been further investigated 
by Eigen, McCaskill and Schuster [14] and it had a profound impact on 
the understanding of molecular evolution [10]. It has been argued that, 
at the population level, evolutionary processes select quasispecies rather 
than single individuals. Even more importantly, this theory is supported 
by experimental studies [11]. Specifically, it seems that some RNA viruses
evolve with a rather high mutation rate, which is adjusted to be close to 
an error threshold. It has been suggested that this is the case for the HIV 
virus [36]. Some promising antiviral strategies consist in using mutagenic
drugs that induce an error catastrophe [H [7] . A similar error catastrophe 
could also play a role in the development of some cancers [34] . 

Eigen's model was initially designed to understand a population of
macromolecules governed by a family of chemical reactions. In this set- 
ting, the number of molecules is huge, and there is a finite number of types 
of molecules. From the start, this model is formulated for an infinite pop- 
ulation and the evolution is deterministic (mathematically, it is a family of 
differential equations governing the time evolution of the densities of each 
type of macromolecule). The error threshold appears when the number of 
types goes to $\infty$. This creates a major obstacle if one wishes to extend the
notions of quasispecies and error threshold to genetics. Biological popula- 
tions are finite, and even if they are large so that they might be considered 
infinite in some approximate scheme, it is not coherent to consider situa- 
tions where the size of the population is much larger than the number of 
possible genotypes. Moreover, it has long been recognized that random ef- 
fects play a major role in the genetic evolution of populations [53] , yet they 
are ruled out from the start in a deterministic infinite population model. 
Therefore, it is crucial to develop a finite population counterpart to Eigen's
model, which incorporates stochastic effects. This problem is already discussed
by Eigen, McCaskill and Schuster [14] and more recently by Wilke
[55]. Numerous works have attacked this issue: Demetrius, Schuster and
Sigmund [8], McCaskill [26], Gillespie Q2], Weinberger [38]. Nowak and 
Schuster [30] constructed a birth and death model to approximate Eigen's 
model. This birth and death model plays a key role in our analysis, as 
we shall see later. Alves and Fontanari pQ study how the error threshold 
depends on the population in a simplified model. More recently, Musso 
[27] and Dixit, Srivastava, Vishnoi [9] considered finite population models
which approximate Eigen's model when the population size goes to $\infty$.
These models are variants of the classical Wright-Fisher model of popu- 
lation genetics. Although this is an interesting approach, it is already a 
delicate matter to prove the convergence of these models towards Eigen's 
model. We adopt here a different strategy. Instead of trying to prove that 
some finite population model converges in some sense to Eigen's model, we 
try to prove directly in the finite model an error threshold phenomenon. 
To this end, we look for the simplest possible model, and we end up with 
a Moran model. The model we choose here is not particularly original; the
contribution of this work is rather to show a way to analyze this kind of
finite population model.

We consider a population of size $m$ of chromosomes of length $\ell$ over
the alphabet $\{A, T, G, C\}$. The evolution of the population is governed
by two antagonistic effects, namely mutation and replication. Mutations 
occur randomly and independently at each locus with probability q. The 
replication rate of a chromosome is given by its fitness. We consider only 
the sharp peak landscape: there is one specific sequence, called the master 
sequence, whose fitness is $\sigma > 1$, and all the other sequences have fitness
equal to 1. The mutations drive the population towards a totally random 
state, while the replication favors the master sequence. These two effects 
interact in a complicated way in the dynamics and it is extremely difficult 
to analyze precisely the time evolution of such a model. Let us focus on 
the equilibrium distribution of the process. A fundamental problem is to 
determine the law of the number of copies of the master sequence present 
in the population at equilibrium. If we keep the parameters $m$, $\ell$, $q$ fixed,
there is little hope to get useful results. In order to simplify the picture, we 
consider an adequate asymptotic regime. In Eigen's model, the population 
size is infinite from the start. The error threshold appears when $\ell$ goes to
$\infty$ and $q$ goes to 0 in a regime where $\ell q = a$ is kept constant. We wish to
understand the influence of the population size $m$, thus we use a different
approach and we consider the following regime. We send simultaneously
$m$, $\ell$ to $\infty$ and $q$ to 0 and we try to understand the respective influence
of each parameter on the equilibrium law of the master sequence. By
the ergodic theorem, the average number of copies of the master sequence
at equilibrium is equal to the limit, as the time goes to $\infty$, of the time
average of the number of copies of the master sequence present through
the whole evolution of the process. In the finite population model, the
number of copies of the master sequence fluctuates with time. Our analysis 
of these fluctuations relies on the following heuristics. Suppose that the 
process starts with a population of size m containing exactly one master 
sequence. The master sequence is likely to invade the whole population 
and become dominant. Then the master sequence will be present in the 
population for a very long time without interruption. We call this time 
the persistence time of the master sequence. The destruction of all the 
master sequences of the population is quite unlikely, nevertheless it will 
happen and the process will eventually land in the neutral region consisting 
of the populations devoid of master sequences. The process will wander 
randomly throughout this region for a very long time. We call this time 
the discovery time of the master sequence. Because the cardinality of the 
possible genotypes is enormous, the master sequence is difficult to discover, 
nevertheless the mutations will eventually succeed and the process will start 
again with a population containing exactly one master sequence. If, on 
average, the discovery time is much larger than the persistence time, then 
the equilibrium state will be totally random, while a quasispecies will be 
formed if the persistence time is much larger than the discovery time. Let 
us illustrate this idea in a very simple model. 

Figure 1: Random walk example

We consider the random walk on $\{0, \dots, \ell\}$ with the transition probabilities
depending on a parameter $\theta$ given by:

$$p(0,1) = \frac{\theta}{2}, \qquad p(0,0) = 1 - \frac{\theta}{2}, \qquad p(\ell,\ell-1) = p(\ell,\ell) = \frac{1}{2},$$

$$p(i,i-1) = p(i,i+1) = \frac{1}{2}, \qquad 1 \le i \le \ell-1.$$

The integer $\ell$ is large and the parameter $\theta$ is small. Hence the walker
spends its time either wandering in $\{1, \dots, \ell\}$ or being trapped in 0. The
state 0 plays the role of the quasispecies while the set $\{1, \dots, \ell\}$ plays the
role of the neutral region. With this analogy in mind, the persistence time
is the expected time of exit from 0; it is equal to $2/\theta$. The discovery time
is the expected time needed to discover 0 starting for instance from 1; it is
equal to $2\ell$. The equilibrium law of the walker is the probability measure
$\mu$ given by

$$\mu(0) = \frac{1}{1+\theta\ell}, \qquad \mu(1) = \cdots = \mu(\ell) = \frac{\theta}{1+\theta\ell}.$$

We send $\ell$ to $\infty$ and $\theta$ to 0 simultaneously. If $\theta\ell$ goes to $\infty$, the entropy
factor wins and $\mu$ becomes totally random. If $\theta\ell$ goes to 0, the selection
drift wins and $\mu$ converges to the Dirac mass at 0.
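The toy walk is small enough to be checked by direct computation. The sketch below, written in Python with exact rational arithmetic, builds the transition matrix, verifies that the law $\mu(0) = 1/(1+\theta\ell)$, $\mu(i) = \theta/(1+\theta\ell)$ for $1 \le i \le \ell$ is invariant, and recovers the discovery time $2\ell$ from the hitting time equations. The function names are illustrative and are not part of the original work.

```python
from fractions import Fraction


def transition_matrix(ell, theta):
    """Transition matrix of the walk on {0, ..., ell}; theta is a Fraction."""
    half = Fraction(1, 2)
    P = [[Fraction(0)] * (ell + 1) for _ in range(ell + 1)]
    P[0][1] = theta / 2        # leave the trap 0
    P[0][0] = 1 - theta / 2    # stay trapped in 0
    for i in range(1, ell):
        P[i][i - 1] = half
        P[i][i + 1] = half
    P[ell][ell - 1] = half     # the right end holds with probability 1/2
    P[ell][ell] = half
    return P


def equilibrium_law(ell, theta):
    """Candidate invariant law: mass 1/(1+theta*ell) at 0, theta/(1+theta*ell) elsewhere."""
    z = 1 + theta * ell
    return [1 / z] + [theta / z] * ell


def is_invariant(mu, P):
    """Check mu P = mu exactly."""
    n = len(mu)
    return all(sum(mu[i] * P[i][j] for i in range(n)) == mu[j] for j in range(n))


def hitting_times_of_zero(ell):
    """h[i] = expected time to reach 0 from i, for i = 0, ..., ell."""
    # The increments d(i) = h(i) - h(i-1) satisfy d(ell) = 2 and d(i) = 2 + d(i+1).
    d = [Fraction(0)] * (ell + 2)
    d[ell] = Fraction(2)
    for i in range(ell - 1, 0, -1):
        d[i] = 2 + d[i + 1]
    h = [Fraction(0)]
    for i in range(1, ell + 1):
        h.append(h[-1] + d[i])
    return h


if __name__ == "__main__":
    ell, theta = 10, Fraction(1, 50)
    mu = equilibrium_law(ell, theta)
    print("invariant:", is_invariant(mu, transition_matrix(ell, theta)))
    print("mass at 0:", mu[0], " discovery time from 1:", hitting_times_of_zero(ell)[1])
```

Varying `theta * ell` in this example shows the two regimes: the mass at 0 is close to 1 when the product is small and close to 0 when it is large.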

In order to implement the previous heuristics, we have to estimate the 
persistence time and the discovery time of the master sequence in the Moran 
model. For the persistence time, we rely on a classical computation from 
mathematical genetics. Suppose we start with a population containing
$m - 1$ copies of the master sequence and another non master sequence.
The non master sequence is very unlikely to invade the whole population,
yet it has a small probability to do so, called the fixation probability. If we
neglect the mutations, standard computations yield that, in a population
of size $m$, if the master sequence has a selective advantage of $\sigma$, the fixation
probability of the non master sequence is roughly of order $1/\sigma^m$ (see for
instance [29], section 6.3). Now the persistence time can be viewed as the
time needed for non master sequences to invade the population. This time
is approximately equal to the inverse of the fixation probability of the non
master sequence, that is of order $\sigma^m$. For the discovery time, there is no
miracle: before discovering the master sequence, the process is likely to
explore a significant portion of the genotype space, hence the discovery
time should be of order

$$\mathrm{card}\,\{A, T, G, C\}^\ell = 4^\ell.$$

These simple heuristics indicate that the persistence time depends on the 
selection drift, while the discovery time depends on the spatial entropy. 
Suppose that we send $m$, $\ell$ to $\infty$ simultaneously. If the discovery time is
much larger than the persistence time, then the population will be neutral 
most of the time and the fraction of the master sequence at equilibrium 
will be null. If the persistence time is much larger than the discovery time, 
then the population will be invaded by the master sequence most of the 
time and the fraction of the master sequence at equilibrium will be positive. 
Thus the master sequence vanishes in the regime

$$m, \ell \to +\infty, \qquad \frac{m}{\ell} \to 0,$$

while a quasispecies might be formed in the regime

$$m, \ell \to +\infty, \qquad \frac{m}{\ell} \to +\infty.$$

This leads to an interesting feature, namely the existence of a critical population
size for the emergence of a quasispecies. For chromosomes of length $\ell$,
a quasispecies can be formed only if the population size $m$ is such that the ratio
$m/\ell$ is large enough. In order to go further, we must put the heuristics on
a firmer ground and we should take the mutations into account when esti- 
mating the persistence time. The main problem is to obtain finer estimates 
on the persistence and discovery times. We cannot compute explicitly the 
laws of these random times, so we will compare the Moran model with 
simpler processes. 
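The balance behind this critical ratio can be written out explicitly. This is only the crude heuristic described above, with mutations neglected in the persistence time: equating the two orders of magnitude gives

$$\sigma^m \approx 4^\ell \iff m \ln\sigma \approx \ell \ln 4 \iff \frac{m}{\ell} \approx \frac{\ln 4}{\ln\sigma}.$$

For instance, with $\sigma = 2$ the two times are of the same order when $m \approx 2\ell$; a population much smaller than this would spend most of its time in the neutral region.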



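Figure 2: Approximating process. Panels: the birth and death chain of Nowak and Schuster on $\{0, \dots, m\}$, with transition probabilities $\frac{\sigma i(m-i)e^{-a}}{m(\sigma i+m-i)}$ from $i$ to $i+1$ and $\frac{\sigma i^2(1-e^{-a})+i(m-i)}{m(\sigma i+m-i)}$ from $i$ to $i-1$, and the Ehrenfest walk $Y_n$ on $\{0, \dots, \ell\}$.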

In the non neutral populations, we shall compare the process with a birth
and death process $(Z_n)_{n \ge 0}$ on $\{0, \dots, m\}$, which is precisely the one introduced
by Nowak and Schuster [30]. The value $Z_n$ approximates the
number of copies of the master sequence present in the population. For
birth and death processes, explicit formulas are available and we obtain
that, if $\ell, m \to +\infty$, $q \to 0$, $\ell q \to a \in \,]0,+\infty[$, then

$$\text{persistence time} \sim \exp\big(m\,\phi(a)\big),$$

where

$$\phi(a) = \frac{\sigma(1-e^{-a}) \ln\dfrac{\sigma(1-e^{-a})}{\sigma-1} + \ln(\sigma e^{-a})}{1-\sigma(1-e^{-a})}.$$
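The exponential growth of the persistence time can be checked numerically on the birth and death chain. The sketch below is an illustration under assumptions: it takes the up and down transition probabilities as they can be read from the labels of Figure 2, applies the standard formula for the expected absorption time of a birth and death chain, and works with logarithms to avoid overflow. The function name `log_persistence_time` is hypothetical.

```python
import math


def log_persistence_time(m, sigma, a):
    """Logarithm of the expected time to reach 0 from 1 for the assumed chain."""
    surv = math.exp(-a)  # probability that a replication is error free

    def up(i):    # a master replicates without error and replaces a non master
        return sigma * i * (m - i) * surv / (m * (sigma * i + m - i))

    def down(i):  # a mutated or non master offspring replaces a master
        return (sigma * i * i * (1.0 - surv) + i * (m - i)) / (m * (sigma * i + m - i))

    # E_1[T_0] = sum_{j=1}^{m} (1/down(j)) * prod_{k=1}^{j-1} up(k)/down(k),
    # accumulated term by term in log scale.
    log_terms = []
    log_prod = 0.0
    for j in range(1, m + 1):
        log_terms.append(log_prod - math.log(down(j)))
        if j < m:
            log_prod += math.log(up(j)) - math.log(down(j))
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


if __name__ == "__main__":
    sigma, a = 2.0, 0.3
    for m in (250, 500, 1000, 2000):
        print(m, log_persistence_time(m, sigma, a) / m)
```

The printed ratio $(1/m)\ln(\text{persistence time})$ should approach $\phi(a)$ as $m$ grows; the convergence is slow because of a correction of order $(\ln m)/m$.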

In the neutral populations, we shall replace the process by a random walk
on $\{A, T, G, C\}^\ell = \mathcal{A}^\ell$. The lumped version of this random walk behaves
like an Ehrenfest process $(Y_n)_{n \ge 0}$ on $\{0, \dots, \ell\}$ (see [5] for a nice review).
The value $Y_n$ represents the distance of the walker to the master sequence.
A celebrated theorem of Kac from 1947 [20], which helped to resolve a
famous paradox of statistical mechanics, yields that, when $\ell \to \infty$,

$$\text{discovery time} \sim 4^\ell.$$
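Kac's theorem can be illustrated on a small example. The sketch below does not use the neutral dynamics of the Moran model; it assumes a simpler, hypothetical walk on $\mathcal{A}^\ell$ in which, at each step, one locus chosen uniformly is replaced by a different letter chosen uniformly. Its lumped chain (the number of loci differing from the master sequence) is an Ehrenfest-type birth and death chain, and the mean return time to the master sequence computed from the hitting time equations equals $\kappa^\ell$ exactly, as Kac's theorem predicts for a chain whose invariant law is uniform.

```python
from fractions import Fraction


def mean_return_time_to_master(ell, kappa):
    """Mean return time to distance 0 for the lumped single-locus walk."""

    def up(d):    # a matching locus is hit: the distance increases
        return Fraction(ell - d, ell)

    def down(d):  # a differing locus is hit and reset to the master letter
        return Fraction(d, ell) * Fraction(1, kappa - 1)

    # Expected time to reach 0 from 1 for a birth and death chain:
    # sum over j of (1/down(j)) * prod_{k<j} up(k)/down(k).
    total = Fraction(0)
    prod = Fraction(1)
    for j in range(1, ell + 1):
        total += prod / down(j)
        prod *= up(j) / down(j)
    # From distance 0 the walk always moves to distance 1.
    return 1 + total


if __name__ == "__main__":
    for ell in (1, 2, 4, 8):
        print(ell, mean_return_time_to_master(ell, 4), 4 ** ell)
```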

Thus the Moran process is approximated by the process on

$$(\{0, \dots, \ell\} \times \{0\}) \cup (\{0\} \times \{0, \dots, m\})$$

described loosely as follows. On $\{0, \dots, \ell\} \times \{0\}$, the process follows
the dynamics of the Ehrenfest urn. On $\{0\} \times \{0, \dots, m\}$, the process
follows the dynamics of the birth and death process of Nowak and Schuster
[30]. When in $(0, 0)$, the process can jump to either axis. With this simple
heuristic picture, we recover all the features of our main result. We suppose
that

$$\ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0,$$

in such a way that

$$\ell q \to a \in \,]0,+\infty[\,, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty].$$

The critical curve is then defined by the equation

$$\text{discovery time} \sim \text{persistence time}$$

which can be rewritten as

$$\alpha\,\phi(a) = \ln 4.$$

This way we obtain an equation in the parameter space $(a, \alpha)$ separating
the regime where the equilibrium population is totally random from the
regime where a quasispecies is formed. We observe the existence of a critical
population size necessary for a quasispecies to emerge and we recover
the finite population counterpart of the error threshold. Moreover, in the
regime of very small mutations, we obtain a lower bound on the population
size allowing the emergence of a quasispecies: if $\alpha < \ln 4/\ln\sigma$ then
the equilibrium population is totally random, and a quasispecies can be
formed only when $\alpha > \ln 4/\ln\sigma$. Finally, in the limit of very large populations,
we recover an error catastrophe reminiscent of Eigen's model: if
$\sigma e^{-a} < 1$ then the equilibrium population is totally random, and a quasispecies
can be formed only when $\sigma e^{-a} > 1$. These results are supported by
computer simulations. The good news is that, already for small values of
$\ell$, the simulations are very conclusive.




Figure 3: Simulation of the equilibrium density of the Master sequence 

It is certainly well known that the population dynamics depends on the 
population size (see the discussion of Wilke). In a theoretical study
[25], Van Nimwegen, Crutchfield and Huynen developed a model for the
evolution of populations on neutral networks and they show that an impor- 
tant parameter is the product of the population size and the mutation rate. 
The nature of the dynamics changes radically depending on whether this 
product is small or large. Sumedha, Martin and Peliti [35] analyze further 
the influence of this parameter. In [37], Van Nimwegen and Crutchfield
derived analytical expressions for the waiting times needed to increase the 
fitness, starting from a local optimum. Their scaling relations involve the 
population size and show the existence of two different barriers, a fitness 
barrier and an entropy barrier. Although they pursue a different goal than
ours, most of the heuristic ingredients explained previously are present in
their work, and much more; they observe and discuss also the transition 
from the quasispecies regime for large populations to the disordered regime 
for small populations. The dependence on the population size and genome 
length has been investigated numerically by Elena, Wilke, Ofria and Lenski 
[15] . Here we show rigorously the existence of a critical population size for 
the sharp peak landscape in a specific asymptotic regime. The existence of 
a critical population size for the emergence of a quasispecies is a pleasing 
result: it shows that, even under the action of selection forces, a form of 
cooperation is necessary to create a quasispecies. Moreover the critical pop- 
ulation size is much smaller than the cardinality of the possible genotypes. 
In conclusion, even in the very simple framework of the Moran model on 
the sharp peak landscape, cooperation is necessary to achieve the survival 
of the master sequence. 

As emphasized by Eigen in [13], the error threshold phenomenon is
similar to a phase transition in statistical mechanics. Leuthäusser established
a formal correspondence between Eigen's model and an anisotropic
Ising model [24]. Several researchers have employed tools from statistical 
mechanics to analyze models of biological evolution, and more specifically 
the error threshold: see the nice review written by Baake and Gabriel [3].
Baake investigated the so-called Onsager landscape in [2]. This way she 
could transfer to a biological model the famous computation of Onsager for 
the two dimensional Ising model. Saakian, Deem and Hu [32] compute the 
variance of the mean fitness in a finite population model in order to control
how it approximates the infinite population model. Deem, Muñoz and
Park [31] use a field theoretic representation in order to derive analytical
results. 

We were also very much inspired by ideas from statistical mechanics, 
but with a different flavor. We do not use exact computations, rather we 
rely on softer tools, namely coupling techniques and correlation inequali- 
ties. These are the basic tools to prove the existence of a phase transition 
in classical models, like the Ising model or percolation. We seek large de- 
viation estimates rather than precise scaling relations in our asymptotic 
regime. Of course the outcome of these techniques is very rough compared 
to exact computations, yet they are much more robust and their range of 
applicability is much wider. The model is presented in the next section 
and the main results in section 3. The remaining sections are devoted to
the proofs. In the appendix we recall several classical results of the theory
of finite Markov chains.

2 The model.



This section is devoted to the presentation of the model. Let $\mathcal{A}$ be a finite
alphabet and let $\kappa = \mathrm{card}\,\mathcal{A}$ be its cardinality. Let $\ell \ge 1$ be an integer.
We consider the space $\mathcal{A}^\ell$ of sequences of length $\ell$ over the alphabet $\mathcal{A}$.
Elements of this space represent the chromosome of a haploid individual,
or equivalently its genotype. In our model, all the genes have the same
set of alleles and each letter of the alphabet $\mathcal{A}$ is a possible allele. Typical
examples are $\mathcal{A} = \{A, T, G, C\}$ to model standard DNA, or $\mathcal{A} = \{0, 1\}$ to
deal with binary sequences. Generic elements of $\mathcal{A}^\ell$ will be denoted by the
letters $u, v, w$. We shall study a simple model for the evolution of a finite
population of chromosomes on the space $\mathcal{A}^\ell$. An essential feature of the
model we consider is that the size of the population is constant throughout
the evolution. We denote by $m$ the size of the population. A population is
an $m$-tuple of elements of $\mathcal{A}^\ell$. Generic populations will be denoted by the
letters $x, y, z$. Thus a population $x$ is a vector

$$x = \begin{pmatrix} x(1) \\ \vdots \\ x(m) \end{pmatrix}$$

whose components are chromosomes. For $i \in \{1, \dots, m\}$, we denote by

$$x(i,1), \dots, x(i,\ell)$$

the letters of the sequence $x(i)$. This way a population $x$ can be represented
as an array

$$x = \begin{pmatrix} x(1,1) & \cdots & x(1,\ell) \\ \vdots & & \vdots \\ x(m,1) & \cdots & x(m,\ell) \end{pmatrix}$$

of size $m \times \ell$ of elements of $\mathcal{A}$, the $i$-th line being the $i$-th chromosome.
The evolution of the population will be random and it will be driven by
two antagonistic forces: mutation and replication.

Mutation. We assume that the mutation mechanism is the same for all the
loci, and that mutations occur independently. Moreover we choose the most
symmetric mutation scheme. We denote by $q \in \,]0, 1 - 1/\kappa[$ the probability of
the occurrence of a mutation at one particular locus. If a mutation occurs,
then the letter is replaced randomly by another letter, chosen uniformly
over the $\kappa - 1$ remaining letters. We encode this mechanism in a mutation
matrix

$$\big(M(u,v),\ u, v \in \mathcal{A}^\ell\big)$$

where $M(u,v)$ is the probability that the chromosome $u$ is transformed by
mutation into the chromosome $v$. The analytical formula for $M(u,v)$ is
then

$$M(u,v) = \prod_{j=1}^{\ell} \Big( (1-q)\,1_{u(j)=v(j)} + \frac{q}{\kappa-1}\,1_{u(j)\neq v(j)} \Big).$$
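The product formula translates directly into code. The following sketch, with the hypothetical name `mutation_probability`, evaluates $M(u,v)$ for chromosomes stored as strings; passing $q$ as a `Fraction` keeps the arithmetic exact, which makes it easy to confirm that each row of the mutation matrix sums to one.

```python
from fractions import Fraction
from itertools import product


def mutation_probability(u, v, q, kappa):
    """M(u, v): probability that the chromosome u is transformed into v."""
    if len(u) != len(v):
        raise ValueError("chromosomes must have the same length")
    p = Fraction(1)
    for a, b in zip(u, v):
        # Each locus is unchanged with probability 1 - q, and otherwise
        # becomes any given other letter with probability q / (kappa - 1).
        p *= (1 - q) if a == b else q / (kappa - 1)
    return p


if __name__ == "__main__":
    alphabet, q = "ATGC", Fraction(1, 10)
    u = "AT"
    row = [mutation_probability(u, "".join(v), q, len(alphabet))
           for v in product(alphabet, repeat=len(u))]
    print("row sum:", sum(row))
```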



Replication. The replication favors the development of fit chromosomes. 
The fitness of a chromosome is encoded in a fitness function

$$A : \mathcal{A}^\ell \to [0, +\infty[\,.$$

The fitness of a chromosome can be interpreted as its reproduction rate.
A chromosome $u$ gives birth at random times and the mean time interval
between two consecutive births is $1/A(u)$. In the context of Eigen's model,
the quantity A(u) is the kinetic constant associated to the chemical reaction 
for the replication of a macromolecule of type u. 

Authorized changes. In our model, the only authorized changes in the 
population consist in replacing one chromosome of the population by a new 
one. The new chromosome is obtained by replicating another chromosome, 
possibly with errors. We introduce a specific notation corresponding to 
these changes. For a population $x \in (\mathcal{A}^\ell)^m$, $j \in \{1, \dots, m\}$, $u \in \mathcal{A}^\ell$, we
denote by $x(j \leftarrow u)$ the population $x$ in which the $j$-th chromosome $x(j)$
has been replaced by $u$:

$$x(j \leftarrow u) = \begin{pmatrix} x(1) \\ \vdots \\ x(j-1) \\ u \\ x(j+1) \\ \vdots \\ x(m) \end{pmatrix}$$

We make this modeling choice in order to build a very simple model. Models of
this type are in fact classical in population dynamics; they are called
Moran models [16].

The mutation-replication scheme. Several further choices have to be
made to define the model precisely. We have to decide how to combine the
mutation and the replication processes. There exist two main schemes in
the literature. In the first scheme, mutations occur at any time of the life
cycle and they are caused by radiation or thermal fluctuations. This leads
to a decoupled Moran model. In the second scheme, mutations occur at the
same time as births and they are caused by replication errors. This is the
case of the famous Eigen model and it leads to the Moran model we study
here. This Moran model can be described loosely as follows. Births occur 
at random times. The rates of birth are given by the fitness function A. 
There is at most one birth at each instant. When an individual gives 
birth, it produces an offspring through a replication process. Errors in the 
replication process induce mutations. The offspring replaces an individual 
chosen randomly in the population (with the uniform probability). 

We build next a mathematical model for the evolution of a finite population
of size $m$ on the space $\mathcal{A}^\ell$, driven by mutation and replication as
described above. We will end up with a stochastic process on the population
space $(\mathcal{A}^\ell)^m$. Since the genetic composition of a population contains
all the necessary information to describe its future evolution, our process 
will be Markovian. 

Discrete versus continuous time. We can either build a discrete time 
Markov chain or a continuous time Markov process. Although the math- 
ematical construction of a discrete time Markov chain is simpler, a con- 
tinuous time process seems more adequate as a model of evolution for a 
population: births, deaths and mutations can occur at any time. In ad- 
dition, the continuous time model is mathematically more appealing. We 
will build both types of models, in continuous and discrete time. Continu- 
ous time models are conveniently defined by their infinitesimal generators, 
while discrete time models are defined by their transition matrices (see the 
appendix). It should be noted, however, that the discrete time and the 
continuous time processes are linked through a standard stochastization 
procedure and they have the same stationary distribution. Therefore the 
asymptotic results we present here hold in both frameworks. 

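Infinitesimal generator. The continuous time Moran model is the Markov
process $(X_t)_{t \in \mathbb{R}^+}$ having the following infinitesimal generator: for $\phi$ a
function from $(\mathcal{A}^\ell)^m$ to $\mathbb{R}$ and for any $x \in (\mathcal{A}^\ell)^m$,

$$\lim_{t \to 0} \frac{1}{t} \Big( E\big(\phi(X_t) \,\big|\, X_0 = x\big) - \phi(x) \Big) = \sum_{1 \le i,j \le m} \ \sum_{u \in \mathcal{A}^\ell} A(x(i))\, M(x(i), u)\, \big( \phi(x(j \leftarrow u)) - \phi(x) \big).$$

Transition matrix. The discrete time Moran model is the Markov chain
$(X_n)_{n \in \mathbb{N}}$ whose transition matrix is given by

$$\forall n \in \mathbb{N} \quad \forall x \in (\mathcal{A}^\ell)^m \quad \forall j \in \{1, \dots, m\} \quad \forall u \in \mathcal{A}^\ell \setminus \{x(j)\}$$

$$P\big(X_{n+1} = x(j \leftarrow u) \,\big|\, X_n = x\big) = \frac{1}{m^2 \lambda} \sum_{1 \le i \le m} A(x(i))\, M(x(i), u),$$

where $\lambda > 0$ is a constant such that

$$\lambda \ge \max\,\{\, A(u) : u \in \mathcal{A}^\ell \,\}.$$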
The other non diagonal coefficients of the transition matrix are zero. The 
diagonal terms are chosen so that the sum of each line is equal to one. 
Notice that the continuous time formulation is more concise and elegant: 
it does not require the knowledge of the maximum of the fitness function 
A in its definition. 

Loose description of the dynamics. We explain first the discrete time
dynamics of the Markov chain $(X_n)_{n \in \mathbb{N}}$. Suppose that $X_n = x$ for some
$n \in \mathbb{N}$ and let us describe loosely the transition mechanism to $X_{n+1} = y$.
An index $i$ in $\{1, \dots, m\}$ is selected randomly with the uniform probability.
With probability $1 - A(x(i))/\lambda$, nothing happens and $y = x$. With
probability $A(x(i))/\lambda$, the chromosome $x(i)$ enters the replication process
and it produces an offspring $u$ according to the law $M(x(i), \cdot)$ given by
the mutation matrix. Another index $j$ is selected randomly with uniform
probability in $\{1, \dots, m\}$. The population $y$ is obtained by replacing the
chromosome $x(j)$ in the population $x$ by the chromosome $u$.
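The transition mechanism just described can be sketched in a few lines. This is a minimal illustration, not the simulation code used for the figures: populations are lists of strings, the helper names (`mutate`, `moran_step`, `sharp_peak_fitness`) are hypothetical, and the caller is assumed to supply a constant `lam` at least as large as the maximal fitness.

```python
import random


def sharp_peak_fitness(master, sigma):
    """Fitness sigma for the master sequence and 1 for every other sequence."""
    return lambda u: sigma if u == master else 1.0


def mutate(u, q, alphabet, rng):
    """Copy u, changing each locus independently with probability q."""
    out = []
    for letter in u:
        if rng.random() < q:
            out.append(rng.choice([b for b in alphabet if b != letter]))
        else:
            out.append(letter)
    return "".join(out)


def moran_step(x, fitness, lam, q, alphabet, rng):
    """One transition of the discrete time chain; returns the new population."""
    m = len(x)
    i = rng.randrange(m)                      # candidate parent, uniform
    if rng.random() >= fitness(x[i]) / lam:   # with prob. 1 - A(x(i))/lam
        return list(x)                        # nothing happens
    u = mutate(x[i], q, alphabet, rng)        # offspring drawn from M(x(i), .)
    j = rng.randrange(m)                      # replaced individual, uniform
    y = list(x)
    y[j] = u
    return y


if __name__ == "__main__":
    rng = random.Random(0)
    master = "AAAA"
    fit = sharp_peak_fitness(master, 2.0)
    x = ["AAAA", "ATGC", "CCCC", "AAAT"]
    for _ in range(1000):
        x = moran_step(x, fit, 2.0, 0.05, "ATGC", rng)
    print(x, "copies of the master:", x.count(master))
```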

We consider next the continuous time dynamics of the Markov process
$(X_t)_{t \in \mathbb{R}^+}$. The dynamics is governed by a clock that rings randomly.
The time interval $\tau$ between two consecutive rings of the clock is exponentially
distributed with parameter $m^2 \lambda$:

$$\forall t \in \mathbb{R}^+ \qquad P(\tau > t) = \exp(-m^2 \lambda t).$$

Suppose that the clock rings at time $t$ and that the process was in state $x$
just before the time $t$. The population $x$ is transformed into the population
$y$ following the same scheme as for the discrete time Markov chain $(X_n)_{n \in \mathbb{N}}$
described previously. At time $t$, the process jumps to the state $y$.
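The continuous time dynamics can be simulated by attaching exponential waiting times to the discrete chain. The sketch below only generates the ring times of the clock on a time window, using the standard library method `random.Random.expovariate`; at each ring one would apply one transition of the discrete time chain. The function name `clock_ring_times` is illustrative.

```python
import random


def clock_ring_times(m, lam, t_max, rng):
    """Ring times in [0, t_max] of a clock with exponential(m^2 * lam) gaps."""
    rate = m * m * lam
    times = []
    t = rng.expovariate(rate)
    while t <= t_max:
        times.append(t)
        t += rng.expovariate(rate)
    return times


if __name__ == "__main__":
    rng = random.Random(1)
    rings = clock_ring_times(m=10, lam=2.0, t_max=5.0, rng=rng)
    print("number of rings:", len(rings), "expected about", 10 * 10 * 2.0 * 5.0)
```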
3 Main results.



This section is devoted to the presentation of the main results. 

Convention. The results hold for both the discrete time and the continuous
time models, so we do not make separate statements. The time
variable is denoted by $t$ throughout this section; it is either discrete with
values in $\mathbb{N}$ or continuous with values in $\mathbb{R}^+$.

Sharp peak landscape. We will consider only the sharp peak landscape
defined as follows. We fix a specific sequence, denoted by $w^*$, called the
wild type or the master sequence. Let $\sigma > 1$ be a fixed real number. The
fitness function $A$ is given by

$$\forall u \in \mathcal{A}^\ell \qquad A(u) = \begin{cases} 1 & \text{if } u \neq w^*, \\ \sigma & \text{if } u = w^*. \end{cases}$$

Density of the master sequence. We denote by $N(x)$ the number of
copies of the master sequence $w^*$ present in the population $x$:

$$N(x) = \mathrm{card}\,\{\, i : 1 \le i \le m,\ x(i) = w^* \,\}.$$

We are interested in the expected density of the master sequence in the
steady state distribution of the process, that is,

$$\mathrm{Master}(\sigma, \ell, m, q) = \lim_{t \to \infty} E\Big( \frac{1}{m} N(X_t) \Big),$$

as well as the variance

$$\mathrm{Variance}(\sigma, \ell, m, q) = \lim_{t \to \infty} E\Big( \Big( \frac{1}{m} N(X_t) - \mathrm{Master}(\sigma, \ell, m, q) \Big)^2 \Big).$$

The limits exist because the transition mechanism of the Markov process
$(X_t)_{t \ge 0}$ is irreducible (and aperiodic for the discrete time case) as soon as
the mutation probability is strictly between 0 and 1. Since the state space
is finite, the Markov process $(X_t)_{t \ge 0}$ admits a unique invariant probability
measure, which describes the steady state of the process. The ergodic theorem
for Markov chains implies that the law of $(X_t)_{t \ge 0}$ converges towards
this invariant probability measure, hence the above expectations converge.
The limits depend on the parameters of the model, that is $\sigma, \ell, m, q$. Our
choices for the infinitesimal generator and the transition matrix imply that
the discrete time version and the continuous time version have exactly the
same invariant probability measure. In order to exhibit a sharp transition
phenomenon, we send $\ell$, $m$ to $\infty$ and $q$ to 0. Let $\phi : \mathbb{R}^+ \to \mathbb{R}^+ \cup \{+\infty\}$ be
the function defined by

$$\forall a < \ln\sigma \qquad \phi(a) = \frac{\sigma(1-e^{-a}) \ln\dfrac{\sigma(1-e^{-a})}{\sigma-1} + \ln(\sigma e^{-a})}{1-\sigma(1-e^{-a})}$$

and $\phi(a) = 0$ if $a \ge \ln\sigma$.



Figure 4: Critical curve $\alpha\,\phi(a) = \ln\kappa$ for $\sigma = 2$, $\kappa = 4$ (one region of the plot is labeled Disorder)
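Theorem 3.1 We suppose that

$$\ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0,$$

in such a way that

$$\ell q \to a \in \,]0,+\infty[\,, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty].$$

We have the following dichotomy:

• If $\alpha\,\phi(a) < \ln\kappa$ then $\mathrm{Master}(\sigma, \ell, m, q) \to 0$.

• If $\alpha\,\phi(a) > \ln\kappa$ then $\mathrm{Master}(\sigma, \ell, m, q) \to \dfrac{\sigma e^{-a}-1}{\sigma-1}$.

In both cases, we have $\mathrm{Variance}(\sigma, \ell, m, q) \to 0$.
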
Figure 5: Master sequence at equilibrium



These results are supported by computer simulations (see figure 5). On
the simulations, which are of course done for small values of $\ell$, the transition
associated to the critical population size seems even sharper than
the transition associated to the error threshold. The programs are written
in C with the help of the GNU Scientific Library and the graphical output
is generated with the help of the Gnuplot program. To increase the efficiency
of the simulations, we simulated the occupancy process obtained by
lumping the original Moran model. The number of generations in a simulation
run was adjusted empirically in order to stabilize the output within
a reasonable amount of time. Twenty years ago, Nowak and Schuster could
perform simulations with $\ell = 10$ and $m = 100$ for 20 000 generations [30].
Today's computing power makes it easy to simulate models with $\ell = 20$ and
$m = 100$ for 10 000 000 000 generations. The good news is that, already
for small values of $\ell$, the simulations are very conclusive. Figure 5 presents
three pictures corresponding to simulations with $\ell = 4, 8, 16$, as well as the
theoretical shape for $\ell = \infty$ in the last picture.

Figure 6: Varying the length $\ell$ (panels include length=4 and length=8)

Notice that the statement of the theorem holds also in the case where $\alpha$ is
null or infinite. This yields the following results:

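Small populations. If $\ell, m \to +\infty$, $q \to 0$, $\ell q \to a \in \,]0,+\infty[$, $m/\ell \to 0$,
then Master$(\sigma, \ell, m, q) \to 0$.

Large populations. Suppose that

\[ \ell, m \to +\infty , \qquad q \to 0 , \qquad \ell q \to a \in \,]0,+\infty[ \,, \qquad \frac{m}{\ell} \to +\infty . \]

If $a > \ln\sigma$, then Master$(\sigma, \ell, m, q) \to 0$. If $a < \ln\sigma$, then

\[ \text{Master}(\sigma, \ell, m, q) \to \frac{\sigma e^{-a}-1}{\sigma-1} . \]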

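Interestingly, the large population regime is reminiscent of Eigen's model.
A slightly more restrictive formulation consists in sending $\ell$ to $\infty$, $m$ to $\infty$
and $q$ to $0$ in such a way that $m/\ell$ and $\ell q$ are kept constant. We might
then take $q$ and $m$ as functions of $\ell$. Let $a, \alpha \in \,]0,+\infty[$. We take $q = a/\ell$
and $m = \alpha\ell$ and we have

\[
\lim_{\ell\to\infty} \text{Master}\Big(\sigma, \ell, \alpha\ell, \frac{a}{\ell}\Big) \,=\,
\begin{cases}
0 & \text{if } \alpha\,\phi(a) < \ln\kappa \\[4pt]
\dfrac{\sigma e^{-a}-1}{\sigma-1} & \text{if } \alpha\,\phi(a) > \ln\kappa
\end{cases}
\]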
Notice that $\alpha\,\phi(a) > \ln\kappa$ implies that $a < \ln\sigma$ and $\sigma e^{-a} > 1$. The critical
curve

\[ \{\, (a,\alpha) \in \mathbb{R}^+ \times \mathbb{R}^+ : \alpha\,\phi(a) = \ln\kappa \,\} \]

corresponds to parameters $(a,\alpha)$ which are exactly at the error threshold
and the critical population size. We are able to compute explicitly the
critical curve and the limiting density because we consider a toy model. We
did not examine here what happens on the critical curve. It is expected
that the limiting density of the master sequence still fluctuates so that
Variance$(\sigma, \ell, \alpha\ell, a/\ell)$ does not converge to $0$ whenever $\alpha\,\phi(a) = \ln\kappa$. An
important observation is that the critical scaling should be the same for
similar Moran models. In contrast, the critical curve seems to depend
strongly on the specific dynamics of the model. However, in the limit
where $a$ goes to $0$, the function $\phi(a)$ converges towards $\ln\sigma$. This yields
the minimal population size allowing the emergence of a quasispecies.
Figure 7: Critical population size

Corollary 3.2 If $\alpha < \ln\kappa / \ln\sigma$ then

\[ \forall a > 0 \qquad \lim_{\ell\to\infty} \text{Master}\Big(\sigma, \ell, \alpha\ell, \frac{a}{\ell}\Big) = 0 . \]

If $\alpha > \ln\kappa / \ln\sigma$ then

\[ \exists a > 0 \qquad \lim_{\ell\to\infty} \text{Master}\Big(\sigma, \ell, \alpha\ell, \frac{a}{\ell}\Big) > 0 . \]

Figure 8: Error threshold

We can also compute the maximal mutation rate permitting the emergence 
of a quasispecies. Interestingly, this maximal mutation rate is reminiscent 
of the error catastrophe in Eigen's model. 



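Corollary 3.3 If $a > \ln\sigma$ then

\[ \forall \alpha > 0 \qquad \lim_{\ell\to\infty} \text{Master}\Big(\sigma, \ell, \alpha\ell, \frac{a}{\ell}\Big) = 0 . \]

If $a < \ln\sigma$ then

\[ \exists \alpha > 0 \qquad \lim_{\ell\to\infty} \text{Master}\Big(\sigma, \ell, \alpha\ell, \frac{a}{\ell}\Big) > 0 . \]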

In conclusion, on the sharp peak landscape, a quasispecies can emerge only
if

\[ m > \frac{\ln\kappa}{\ln\sigma}\,\ell \,, \qquad q < \frac{\ln\sigma}{\ell} . \]
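As a quick numerical reading of these two necessary conditions, the following Python sketch checks them for given parameters. The function name `quasispecies_possible` is hypothetical. The conditions are asymptotic statements, so for finite $\ell$ the check is only indicative.

```python
import math


def quasispecies_possible(m: int, ell: int, q: float, sigma: float, kappa: int) -> bool:
    """Check the two necessary conditions m > (ln kappa / ln sigma) * ell and q < ln sigma / ell."""
    critical_population = ell * math.log(kappa) / math.log(sigma)
    error_threshold = math.log(sigma) / ell
    return m > critical_population and q < error_threshold
```

With $\sigma = 2$, $\kappa = 4$ and $\ell = 20$, the critical population size is $40$ and the error threshold is $\ln 2 / 20 \approx 0.0347$.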
The heuristic ideas behind theorem 3.1 were explained in the introduction.
These heuristics are quite simple; however, the corresponding proofs are
rather delicate and technical. There is very little hope of carrying out a
proof based entirely on exact computations. Our strategy consists in comparing
the original Moran process with simpler processes in order to derive adequate
lower and upper bounds. To this end, we couple the various processes starting
with different initial conditions (section 4). Unfortunately, the natural
coupling for the Moran model we wish to study is not monotone. Therefore we
consider an almost equivalent model, which we call the normalized Moran
model. This model is obtained by normalizing the reproduction rates so
that the total reproduction rate of any population is one (section 5). We
first observe that the Moran model is exchangeable (section 6). However,
the initial state space of the Moran process has no order structure and it
is huge. We use a classical technique, called lumping, in order to reduce
the state space (section 7). This way we obtain two lumped processes: the
distance process $(D_t)_{t\geq 0}$, which records the Hamming distances between
the chromosomes of the population and the master sequence, and the
occupancy process $(O_t)_{t\geq 0}$, which records the distribution of these Hamming
distances. The distance process is monotone in the neutral case $\sigma = 1$,
while the occupancy process is monotone for any value $\sigma \geq 1$ (section 8).
Therefore we construct lower and upper processes to bound the occupancy
process (section 9). These processes have the same dynamics as the original
process in the neutral region and they evolve as a birth and death process
as soon as the population contains a master sequence. We then use the
ergodic theorem for Markov chains and a renewal argument to estimate
the invariant probability measures of these processes. The behavior of the
lower and upper bounds depends mainly on the persistence time and the
discovery time of the master sequence. We rely on the explicit formulas
available for birth and death processes to estimate the persistence time
(section 10). To estimate the discovery time, we rely on rough estimates
for the mutation dynamics and correlation inequalities (section 11). The
mutation dynamics is quite similar to the Ehrenfest urn; however, it is more
complicated because several mutations can occur simultaneously and exact
formulas are not available. The proof is concluded in section 12.

Warning. From section 6 onwards, we work with the normalized Moran
model defined in section 5. This model is denoted by $(X_t)_{t\geq 0}$ and its
transition matrix by $p$, like the initial Moran model. We deal only with
discrete time processes in the proofs. The time is denoted by $t$ or $n$.
4 Coupling

The definition of the processes through infinitesimal generators is not very
intuitive at first sight. We provide here a direct construction of the
processes, which does not appeal to a general existence result. This
construction is standard and it is the formal counterpart of the loose
description of the dynamics given in section 2. Moreover it provides a useful
coupling of the processes with different initial conditions and different
control parameters $\sigma$, $q$. All the processes will be built on a single large
probability space. We consider a probability space $(\Omega, \mathcal{F}, P)$ containing
the following collection of independent random variables:

• a Poisson process $(\tau(t))_{t\geq 0}$ with intensity $m^2\lambda$.

• two sequences of random variables $I_n$, $J_n$, $n \geq 1$, with uniform law on
the index set $\{1,\dots,m\}$.

• a family of random variables $U_{n,l}$, $n \geq 1$, $1 \leq l \leq \ell$, with uniform law on
the interval $[0,1]$.

• a sequence of random variables $S_n$, $n \geq 1$, with uniform law on the
interval $[0,1]$.

We denote by $\tau_n$ the $n$-th arrival time of the Poisson process $(\tau(t))_{t\geq 0}$, i.e.,

\[ \forall n \geq 1 \qquad \tau_n = \inf\{\, t \geq 0 : \tau(t) = n \,\} . \]

The random variables $I_n$, $J_n$, $U_{n,l}$, $1 \leq l \leq \ell$, and $S_n$ will be used to decide
which move occurs at time $\tau_n$. To build the coupling, it is more convenient
to replace the mutation probability $q$ by the parameter $p$ given by

\[ p = \frac{\kappa\,q}{\kappa-1} . \]

We define a Markov chain $(X_n)_{n\in\mathbb{N}}$ with the help of the previous random
ingredients, whose law is the law of the Moran model. The process starts
at time $0$ from an arbitrary population $x_0$. Let $n \geq 1$, suppose that the
process has been defined up to time $n-1$ and that $X_{n-1} = x$. We explain
how to build $X_n = y$. Let us set $i = I_n$. If $S_n > A(x(i))/\lambda$, then $y = x$.
Suppose next that $S_n \leq A(x(i))/\lambda$. We define $y$ as follows. We index the
elements of the alphabet $\mathcal{A}$ in an arbitrary way:

\[ \mathcal{A} = \{\, a_1, \dots, a_\kappa \,\} . \]
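Let $j = J_n$. We set

\[
\forall l \in \{1,\dots,\ell\} \qquad
y(j,l) \,=\,
\begin{cases}
a_1 & \text{if } U_{n,l} < \dfrac{p}{\kappa} \\
\;\vdots \\
a_r & \text{if } (r-1)\dfrac{p}{\kappa} \leq U_{n,l} < r\,\dfrac{p}{\kappa} \\
\;\vdots \\
a_\kappa & \text{if } (\kappa-1)\dfrac{p}{\kappa} \leq U_{n,l} < p \\[4pt]
x(i,l) & \text{if } U_{n,l} \geq p
\end{cases}
\]

For $k \neq j$ we set $y(k) = x(k)$. Finally we define $X_n = y$.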
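The construction of $X_n$ from $X_{n-1}$ can be sketched in Python as follows. This is a minimal illustration, not the author's code: the names `fitness` and `moran_step` are hypothetical, letters are encoded as integers $0, \dots, \kappa-1$ (so that $a_r$ corresponds to $r-1$), indices are 0-based, and the fitness is assumed to be the sharp peak landscape with $\lambda \geq \sigma$.

```python
from typing import List, Sequence, Tuple

Chromosome = Tuple[int, ...]


def fitness(u: Chromosome, master: Chromosome, sigma: float) -> float:
    """Sharp peak landscape: sigma for the master sequence, 1 otherwise."""
    return sigma if u == master else 1.0


def moran_step(x: Sequence[Chromosome], master: Chromosome, sigma: float, lam: float,
               p: float, kappa: int, i: int, j: int, s: float,
               u: Sequence[float]) -> List[Chromosome]:
    """One step of the discrete time chain, driven by the inputs (I_n, J_n, S_n, U_{n,.}).

    i, j are 0-based versions of I_n, J_n; s plays the role of S_n and
    u[l] the role of U_{n,l+1}.
    """
    parent = x[i]
    if s > fitness(parent, master, sigma) / lam:
        return list(x)  # no replication: y = x
    child = []
    for letter, u_l in zip(parent, u):
        if u_l < p:
            # (r-1) p / kappa <= u_l < r p / kappa selects the letter a_r;
            # min() guards against floating point rounding at the upper end.
            child.append(min(kappa - 1, int(u_l * kappa / p)))
        else:
            child.append(letter)  # the locus is copied without mutation
    y = list(x)
    y[j] = tuple(child)  # the child replaces the chromosome at position j
    return y
```

Feeding the same inputs $(i, j, s, u)$ to chains started from different populations, or run with different values of $\sigma$ and $p$, realizes the coupling described above.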

We define also a Markov process $(X_t)_{t\in\mathbb{R}^+}$ with right continuous
trajectories. The process starts at time $0$ from an arbitrary population $x_0$ and it
moves only when there is an arrival in the Poisson process $(\tau(t))_{t\geq 0}$. Let
$t > 0$ and suppose that $\tau_n = t$ for some $n \geq 1$. Suppose that just before $t$
the process was in state $x$:

\[ \lim_{\substack{s \to t \\ s < t}} X_s = x . \]

We proceed as in the construction of the discrete time process at step $n$ to
build the new population $y$ starting from $x$ and we set $X_t = y$. Therefore
we have

\[ \forall n \geq 0 \quad \forall t \in [\tau_n, \tau_{n+1}[ \qquad X_t = X_n . \]
5 Normalized model

The Moran model defined previously is difficult to analyze for several
reasons. A major problem is that the natural coupling constructed in section 4
is not monotone. We define next a related Moran model which is simpler
to study. This model is obtained by normalizing the reproduction rates so
that the total reproduction rate of any population is one. The continuous
time normalized Moran model is the Markov process $(X_t)_{t\in\mathbb{R}^+}$ whose
infinitesimal generator $L$ is defined as follows: for $\phi$ a function from $(\mathcal{A}^\ell)^m$
to $\mathbb{R}$ and for any $x \in (\mathcal{A}^\ell)^m$,

\[ L\phi(x) \,=\, \sum_{1\leq i,j\leq m} \; \sum_{u \in \mathcal{A}^\ell} \frac{A(x(i))\,M(x(i),u)}{A(x(1)) + \cdots + A(x(m))} \, \big( \phi(x(j \leftarrow u)) - \phi(x) \big) . \]

The discrete time normalized Moran model is the Markov chain $(X_n)_{n\in\mathbb{N}}$
with transition matrix $p$ given by

\[
\forall x \in (\mathcal{A}^\ell)^m \quad \forall j \in \{1,\dots,m\} \quad \forall u \in \mathcal{A}^\ell \setminus \{x(j)\}
\qquad
p(x, x(j \leftarrow u)) \,=\, \frac{1}{m} \sum_{1\leq i\leq m} \frac{A(x(i))\,M(x(i),u)}{A(x(1)) + \cdots + A(x(m))} .
\]

The other non diagonal coefficients of the transition matrix are zero. In the
remainder of the paper, we shall work with this Markov chain $(X_n)_{n\in\mathbb{N}}$ and
the transition matrix $p$. We shall prove the main theorem 3.1 of section 3
for this process. In fact, we shall even prove the following stronger result.
Let $\nu$ be the image of the invariant probability measure of $(X_n)_{n\geq 0}$ through
the map

\[ x \in (\mathcal{A}^\ell)^m \;\mapsto\; \frac{1}{m} N(x) \in [0,1] . \]

The probability measure $\nu$ is a measure on the interval $[0,1]$ describing the
equilibrium density of the master sequence in the population. Indeed,

\[ \forall i \in \{0,\dots,m\} \qquad \nu\Big(\frac{i}{m}\Big) = \lim_{n\to\infty} P\big(N(X_n) = i\big) . \]

The probability $\nu$ depends on the parameters $\sigma, \ell, m, q$ of the model. Let
$\phi(a)$ be the function defined before theorem 3.1, i.e.,

\[
\forall a < \ln\sigma \qquad
\phi(a) \,=\, \frac{\sigma(1-e^{-a})\,\ln\dfrac{\sigma(1-e^{-a})}{\sigma-1} + \ln(\sigma e^{-a})}{1-\sigma(1-e^{-a})}
\]

and $\phi(a) = 0$ if $a \geq \ln\sigma$. Let

\[ \rho^* = \frac{\sigma e^{-a}-1}{\sigma-1} . \]
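Theorem 5.1 We suppose that

\[ \ell \to +\infty , \qquad m \to +\infty , \qquad q \to 0 , \]

in such a way that

\[ \ell q \to a \in \,]0,+\infty[ \,, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty] . \]

We have the following dichotomy:

• If $\alpha\,\phi(a) < \ln\kappa$ then $\nu$ converges towards the Dirac mass at $0$:

\[ \forall \varepsilon > 0 \qquad \nu([0,\varepsilon]) \to 1 . \]

• If $\alpha\,\phi(a) > \ln\kappa$ then $\nu$ converges towards the Dirac mass at $\rho^*$:

\[ \forall \varepsilon > 0 \qquad \nu([\rho^*-\varepsilon, \rho^*+\varepsilon]) \to 1 . \]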

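We shall prove this theorem for the normalized Moran model $(X_n)_{n\geq 0}$. Let
us show how this implies theorem 3.1 for the initial model. In the remainder
of this argument, we denote by $(X'_n)_{n\in\mathbb{N}}$ the Moran model described in
section 2 and by $p'$ its transition matrix. The transition matrices $p$ and $p'$
are related by the simple relation

\[ \forall x, y \in (\mathcal{A}^\ell)^m, \quad x \neq y, \qquad p'(x,y) = \beta(x)\,p(x,y) \]

where

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad \beta(x) = \frac{1}{m\lambda} \big( A(x(1)) + \cdots + A(x(m)) \big) . \]

Let $\mu$ and $\mu'$ be the invariant probability measures of the processes $(X_t)_{t\geq 0}$
and $(X'_t)_{t\geq 0}$. The probability $\mu$ is the unique solution of the system of
equations

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad \mu(x) = \sum_{y \in (\mathcal{A}^\ell)^m} \mu(y)\,p(y,x) . \]

We rewrite these equations as:

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad \mu(x) \sum_{\substack{y \in (\mathcal{A}^\ell)^m \\ y \neq x}} p(x,y) \,=\, \sum_{\substack{y \in (\mathcal{A}^\ell)^m \\ y \neq x}} \mu(y)\,p(y,x) . \]

Replacing $p$ by $p'$, we get

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad \frac{\mu(x)}{\beta(x)} \sum_{\substack{y \in (\mathcal{A}^\ell)^m \\ y \neq x}} p'(x,y) \,=\, \sum_{\substack{y \in (\mathcal{A}^\ell)^m \\ y \neq x}} \frac{\mu(y)}{\beta(y)}\,p'(y,x) . \]

Using the uniqueness of the invariant probability measure associated to $p'$,
we conclude that

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad \mu'(x) \,=\, \frac{\mu(x)/\beta(x)}{\displaystyle\sum_{y \in (\mathcal{A}^\ell)^m} \mu(y)/\beta(y)} . \]

In the case of the sharp peak landscape, the function $\beta(x)$ can be rewritten
as

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad \beta(x) = \frac{1}{m\lambda} \big( (\sigma-1)N(x) + m \big) . \]

Let us denote by $\nu$ and $\nu'$ the images of $\mu$ and $\mu'$ through the map

\[ x \in (\mathcal{A}^\ell)^m \;\mapsto\; \frac{1}{m} N(x) \in [0,1] . \]

We can thus rewrite

\[ \sum_{y \in (\mathcal{A}^\ell)^m} \frac{\mu(y)}{\beta(y)} \,=\, \sum_{y \in (\mathcal{A}^\ell)^m} \frac{\lambda\,\mu(y)}{(\sigma-1)\dfrac{N(y)}{m} + 1} \,=\, \int_{[0,1]} \frac{\lambda\,d\nu(t)}{(\sigma-1)t+1} . \]

For any function $f : [0,1] \to \mathbb{R}$, we have then

\[
\int_{[0,1]} f\,d\nu' \,=\, \lim_{t\to\infty} E\Big( f\Big(\frac{1}{m} N(X'_t)\Big) \Big)
\,=\, \sum_{x \in (\mathcal{A}^\ell)^m} f\Big(\frac{1}{m} N(x)\Big)\,\mu'(x)
\,=\, \frac{\displaystyle\int_{[0,1]} \frac{f(t)\,d\nu(t)}{(\sigma-1)t+1}}{\displaystyle\int_{[0,1]} \frac{d\nu(t)}{(\sigma-1)t+1}} .
\]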
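The passage from $\nu$ to $\nu'$ is a reweighting by the factor $1/((\sigma-1)t+1)$ followed by a normalization. The following Python sketch illustrates this for a measure $\nu$ with finite support; the function name `reweight` and the dictionary encoding of the measure are hypothetical choices made for the illustration.

```python
from typing import Dict


def reweight(nu: Dict[float, float], sigma: float) -> Dict[float, float]:
    """Map a finitely supported nu (point t -> mass) to nu' via the weight 1 / ((sigma - 1) t + 1)."""
    weights = {t: mass / ((sigma - 1.0) * t + 1.0) for t, mass in nu.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}
```

In particular a Dirac mass is left unchanged by this transformation, which is the reason why $\nu$ and $\nu'$ share the same limit when $\nu$ concentrates on a single point.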

We suppose that

\[ \ell \to +\infty , \qquad m \to +\infty , \qquad q \to 0 , \]

in such a way that

\[ \ell q \to a \in \,]0,+\infty[ \,, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty] . \]

By theorem 5.1, away from the critical curve $\alpha\,\phi(a) = \ln\kappa$, the probability
$\nu$ converges towards a Dirac mass. If $\nu$ converges towards a Dirac mass at
$\rho$, then we conclude from the above formula that $\nu'$ converges towards the
same Dirac mass and

\[ \text{Master}(\sigma, \ell, m, q) \to \rho , \qquad \text{Variance}(\sigma, \ell, m, q) \to 0 . \]

This way we obtain the statements of theorem 3.1. From now onwards, in
the proofs, we work exclusively with the normalized Moran process, and
we denote it by $(X_t)_{t\geq 0}$.
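6 Exchangeability

The symmetric group $\mathfrak{S}_m$ of the permutations of $\{1,\dots,m\}$ acts in a
natural way on the populations through the following group operation:

\[ \forall x \in (\mathcal{A}^\ell)^m \quad \forall \rho \in \mathfrak{S}_m \quad \forall j \in \{1,\dots,m\} \qquad (\rho \cdot x)(j) = x(\rho(j)) . \]

A probability measure $\mu$ on $(\mathcal{A}^\ell)^m$ is exchangeable if it is invariant under
the action of $\mathfrak{S}_m$:

\[ \forall \rho \in \mathfrak{S}_m \quad \forall x \in (\mathcal{A}^\ell)^m \qquad \mu(\rho \cdot x) = \mu(x) . \]

A process $(X_t)_{t\geq 0}$ with values in $(\mathcal{A}^\ell)^m$ is exchangeable if and only if, for
any $t \geq 0$, the law of $X_t$ is exchangeable.

Lemma 6.1 The transition matrix $p$ is invariant under the action of $\mathfrak{S}_m$:

\[
\forall x \in (\mathcal{A}^\ell)^m \quad \forall \rho \in \mathfrak{S}_m \quad \forall j \in \{1,\dots,m\} \quad \forall u \in \mathcal{A}^\ell \setminus \{x(j)\}
\qquad
p\big(\rho \cdot x, \rho \cdot (x(j \leftarrow u))\big) = p\big(x, x(j \leftarrow u)\big) .
\]

Proof. Let $x, \rho, j, u$ be as in the statement of the lemma. We have

\[
\begin{aligned}
p\big(\rho \cdot x, \rho \cdot (x(j \leftarrow u))\big)
&= p\big(\rho \cdot x, (\rho \cdot x)(\rho^{-1}(j) \leftarrow u)\big) \\
&= \frac{1}{m} \sum_{1\leq i\leq m} \frac{A((\rho\cdot x)(i))\,M((\rho\cdot x)(i),u)}{A((\rho\cdot x)(1)) + \cdots + A((\rho\cdot x)(m))} \\
&= \frac{1}{m} \sum_{1\leq i\leq m} \frac{A(x(i))\,M(x(i),u)}{A(x(1)) + \cdots + A(x(m))} \,=\, p\big(x, x(j \leftarrow u)\big) .
\end{aligned}
\]

Thus the matrix $p$ satisfies the required invariance property. □

Corollary 6.2 Let $\mu$ be an exchangeable probability distribution on the
population space $(\mathcal{A}^\ell)^m$. The Moran model $(X_t)_{t\geq 0}$ starting with $\mu$ as the
initial distribution is exchangeable.

Proof. Let $\rho \in \mathfrak{S}_m$ and let $f$ be a function from $(\mathcal{A}^\ell)^m$ to $\mathbb{R}$. Using the
exchangeability of $\mu$ and lemma 6.1, we have, for any $t \geq 1$,

\[
\begin{aligned}
E\big(f(\rho \cdot X_t)\big)
&= \sum_{x_0,\dots,x_t \in (\mathcal{A}^\ell)^m} \mu(\rho\cdot x_0)\,p(\rho\cdot x_0, \rho\cdot x_1) \cdots p(\rho\cdot x_{t-1}, \rho\cdot x_t)\, f(\rho\cdot x_t) \\
&= \sum_{x_0,\dots,x_t \in (\mathcal{A}^\ell)^m} \mu(x_0)\,p(x_0, x_1) \cdots p(x_{t-1}, x_t)\, f(x_t) \,=\, E\big(f(X_t)\big) .
\end{aligned}
\]

Thus the process $(X_t)_{t\geq 0}$ is exchangeable. □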
7 Lumping

The state space of the process $(X_t)_{t\geq 0}$ is huge: it has cardinality $\kappa^{\ell m}$. We
will rely on a classical technique, called lumping, to reduce the state space
(see the appendix). We consider here only the sharp peak landscape. In
this situation, the fitness of a chromosome is a function of its distance to
the master sequence. A close look at the mutation mechanism reveals that
chromosomes which are at the same distance from the master sequence are
equivalent for the dynamics, hence they can be lumped together in order
to build a simpler process on a reduced space. For simplicity, we consider
only the discrete time process. However similar results hold in continuous
time.

7.1 Distance process

We denote by $d_H$ the Hamming distance between two chromosomes:

\[ \forall u, v \in \mathcal{A}^\ell \qquad d_H(u,v) = \text{card}\,\{\, j : 1 \leq j \leq \ell,\ u(j) \neq v(j) \,\} . \]

We will keep track of the distances of the chromosomes to the master
sequence $w^*$. We define a function $H : \mathcal{A}^\ell \to \{0,\dots,\ell\}$ by setting

\[ \forall u \in \mathcal{A}^\ell \qquad H(u) = d_H(u, w^*) . \]

The map $H$ induces a partition of $\mathcal{A}^\ell$ into Hamming classes

\[ H^{-1}(\{b\}), \qquad b \in \{0,\dots,\ell\} . \]

We prove first that the mutation matrix is lumpable with respect to the
function $H$.

Lemma 7.1 (Lumped mutation matrix) Let $b, c \in \{0,\dots,\ell\}$ and let
$u \in \mathcal{A}^\ell$ be such that $H(u) = b$. The sum

\[ \sum_{\substack{w \in \mathcal{A}^\ell \\ H(w) = c}} M(u,w) \]

does not depend on $u$ in $H^{-1}(\{b\})$; it is a function of $b$ and $c$ only, which
we denote by $M_H(b,c)$. The coefficient $M_H(b,c)$ is equal to

\[
\sum_{\substack{0 \leq k \leq \ell-b \\ 0 \leq l \leq b \\ k - l = c - b}}
\binom{\ell-b}{k} \binom{b}{l}
\Big( p\Big(1-\frac{1}{\kappa}\Big) \Big)^{k}
\Big( 1 - p\Big(1-\frac{1}{\kappa}\Big) \Big)^{\ell-b-k}
\Big( \frac{p}{\kappa} \Big)^{l}
\Big( 1 - \frac{p}{\kappa} \Big)^{b-l} .
\]
Proof. Let $b, c \in \{0,\dots,\ell\}$ and let $u \in \mathcal{A}^\ell$ be such that $H(u) = b$. We will
compute the law of $H(w)$ whenever $w$ follows the law $M(u,\cdot)$ given by the
line of $M$ associated to $u$. For any $w \in \mathcal{A}^\ell$, we have

\[
\begin{aligned}
H(w) &= \sum_{1\leq l\leq \ell} 1_{w(l) \neq w^*(l)} \\
&= \sum_{1\leq l\leq \ell} \big( 1_{w(l) \neq w^*(l),\, u(l) = w^*(l)} + 1_{w(l) \neq w^*(l),\, u(l) \neq w^*(l)} \big) \\
&= H(u) + \sum_{1\leq l\leq \ell} \big( 1_{w(l) \neq w^*(l),\, u(l) = w^*(l)} - 1_{w(l) = w^*(l),\, u(l) \neq w^*(l)} \big) .
\end{aligned}
\]

According to the mutation kernel $M$, for indices $l$ such that $u(l) = w^*(l)$,
the variable $1_{w(l) \neq w^*(l)}$ is Bernoulli with parameter $p(1 - 1/\kappa)$, while for
indices $l$ such that $u(l) \neq w^*(l)$, the variable $1_{w(l) = w^*(l)}$ is Bernoulli with
parameter $p/\kappa$. Moreover these Bernoulli variables are independent. Thus
the law of $H(w)$ under the kernel $M(u,\cdot)$ is given by

\[ H(u) + \text{Binomial}\big(\ell - H(u),\, p(1 - 1/\kappa)\big) - \text{Binomial}\big(H(u),\, p/\kappa\big) \]

where the two binomial random variables are independent. This law
depends only on $H(u)$, therefore the sum

\[ \sum_{\substack{w \in \mathcal{A}^\ell \\ H(w) = c}} M(u,w) \]

is a function of $b = H(u)$ and $c = H(w)$ only, which we denote by $M_H(b,c)$.
The formula for the lumped matrix $M_H$ is obtained by computing the law
of the difference of the two independent binomial laws appearing above. □
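Lemma 7.1 can be checked by brute force on a tiny example. The Python sketch below is an illustration only: the function names are hypothetical, letters are encoded as integers, and the mutation kernel is assumed to act independently on each locus, keeping the letter with probability $1 - p + p/\kappa$ and producing any given other letter with probability $p/\kappa$, which is the kernel realized by the coupling of section 4.

```python
import itertools
import math
from typing import Sequence


def lumped_mutation(b: int, c: int, ell: int, p: float, kappa: int) -> float:
    """M_H(b, c): law of b + Binomial(ell - b, p(1 - 1/kappa)) - Binomial(b, p/kappa) at c."""
    up = p * (1.0 - 1.0 / kappa)   # a correct locus becomes wrong
    down = p / kappa               # a wrong locus becomes correct
    total = 0.0
    for k in range(ell - b + 1):
        l = k - (c - b)            # enforce the constraint k - l = c - b
        if 0 <= l <= b:
            total += (math.comb(ell - b, k) * up ** k * (1.0 - up) ** (ell - b - k)
                      * math.comb(b, l) * down ** l * (1.0 - down) ** (b - l))
    return total


def brute_force(u: Sequence[int], c: int, master: Sequence[int], p: float, kappa: int) -> float:
    """Sum of M(u, w) over all w at Hamming distance c from the master sequence."""
    total = 0.0
    for w in itertools.product(range(kappa), repeat=len(u)):
        if sum(a != b for a, b in zip(w, master)) != c:
            continue
        prob = 1.0
        for w_l, u_l in zip(w, u):
            prob *= (1.0 - p + p / kappa) if w_l == u_l else p / kappa
        total += prob
    return total
```

The brute force sum enumerates all $\kappa^\ell$ chromosomes, so it is only usable for very small $\ell$; the closed formula has at most $\ell + 1$ terms.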

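The fitness function $A$ of the sharp peak landscape can be factorized
through $H$. If we define

\[ \forall b \in \{0,\dots,\ell\} \qquad A_H(b) = \begin{cases} \sigma & \text{if } b = 0 \\ 1 & \text{if } b \geq 1 \end{cases} \]

then we have

\[ \forall u \in \mathcal{A}^\ell \qquad A(u) = A_H(H(u)) . \]

We define further a vector function $\mathbb{H} : (\mathcal{A}^\ell)^m \to \{0,\dots,\ell\}^m$ by setting

\[ \forall x = \begin{pmatrix} x(1) \\ \vdots \\ x(m) \end{pmatrix} \in (\mathcal{A}^\ell)^m \qquad \mathbb{H}(x) = \begin{pmatrix} H(x(1)) \\ \vdots \\ H(x(m)) \end{pmatrix} . \]

The partition of $(\mathcal{A}^\ell)^m$ induced by the map $\mathbb{H}$ is

\[ \mathbb{H}^{-1}(\{d\}), \qquad d \in \{0,\dots,\ell\}^m . \]

We define finally the distance process $(D_t)_{t\geq 0}$ by

\[ \forall t \geq 0 \qquad D_t = \mathbb{H}(X_t) . \]

Our next goal is to prove that the process $(X_t)_{t\geq 0}$ is lumpable with respect
to the partition of $(\mathcal{A}^\ell)^m$ induced by the map $\mathbb{H}$, so that the distance
process $(D_t)_{t\geq 0}$ is a genuine Markov process.

Proposition 7.2 ($\mathbb{H}$ Lumpability) Let $p$ be the transition matrix of the
Moran model. We have

\[
\forall e \in \{0,\dots,\ell\}^m \quad \forall x, y \in (\mathcal{A}^\ell)^m \qquad
\mathbb{H}(x) = \mathbb{H}(y) \;\Longrightarrow\;
\sum_{\substack{z \in (\mathcal{A}^\ell)^m \\ \mathbb{H}(z) = e}} p(x,z) \,=\, \sum_{\substack{z \in (\mathcal{A}^\ell)^m \\ \mathbb{H}(z) = e}} p(y,z) .
\]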

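Proof. For the process $(X_t)_{t\geq 0}$, the only transitions having positive
probability are the transitions of the form

\[ x \to x(j \leftarrow u) , \qquad 1 \leq j \leq m, \quad u \in \mathcal{A}^\ell . \]

Let $e \in \{0,\dots,\ell\}^m$ and let $x, y \in (\mathcal{A}^\ell)^m$ be such that $\mathbb{H}(x) = \mathbb{H}(y)$.
We set $d = \mathbb{H}(x) = \mathbb{H}(y)$. If the vectors $d$, $e$ differ in more than one
component, then the sums appearing in the statement of the proposition
are equal to zero. Suppose first that the vectors $d$, $e$ differ in exactly one
component, so that there exist $j \in \{1,\dots,m\}$ and $c \in \{0,\dots,\ell\}$ such
that $e = d(j \leftarrow c)$ and $d(j) \neq c$. Naturally, $d(j \leftarrow c)$ is the vector $d$ in
which the $j$-th component $d(j)$ has been replaced by $c$:

\[ d(j \leftarrow c) = \begin{pmatrix} d(1) \\ \vdots \\ d(j-1) \\ c \\ d(j+1) \\ \vdots \\ d(m) \end{pmatrix} . \]

We have then

\[ \sum_{\substack{z \in (\mathcal{A}^\ell)^m \\ \mathbb{H}(z) = e}} p(x,z) \,=\, \sum_{\substack{w \in \mathcal{A}^\ell \\ H(w) = c}} p\big(x, x(j \leftarrow w)\big) . \]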
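Using lemma 7.1 we have

\[
\begin{aligned}
\sum_{\substack{w \in \mathcal{A}^\ell \\ H(w) = c}} p\big(x, x(j \leftarrow w)\big)
&= \sum_{\substack{w \in \mathcal{A}^\ell \\ H(w) = c}} \frac{1}{m} \sum_{1\leq i\leq m} \frac{A(x(i))\,M(x(i),w)}{A(x(1)) + \cdots + A(x(m))} \\
&= \frac{1}{m} \sum_{1\leq i\leq m} \frac{A_H(H(x(i)))\,M_H(H(x(i)),c)}{A_H(H(x(1))) + \cdots + A_H(H(x(m)))} .
\end{aligned}
\]

This sum is a function of $\mathbb{H}(x)$ and $c$ only. Since $\mathbb{H}(x) = \mathbb{H}(y)$, the sums
are the same for $x$ and $y$. Suppose next that $d = e$. Then

\[
\begin{aligned}
\sum_{\substack{z \in (\mathcal{A}^\ell)^m \\ \mathbb{H}(z) = e}} p(x,z)
&= p(x,x) + \sum_{1\leq j\leq m} \; \sum_{\substack{w \in \mathcal{A}^\ell \setminus \{x(j)\} \\ H(w) = H(x(j))}} p\big(x, x(j \leftarrow w)\big) \\
&= 1 - \sum_{1\leq j\leq m} \; \sum_{w \in \mathcal{A}^\ell \setminus \{x(j)\}} p\big(x, x(j \leftarrow w)\big)
 + \sum_{1\leq j\leq m} \; \sum_{\substack{w \in \mathcal{A}^\ell \setminus \{x(j)\} \\ H(w) = H(x(j))}} p\big(x, x(j \leftarrow w)\big) \\
&= 1 - \sum_{1\leq j\leq m} \; \sum_{\substack{w \in \mathcal{A}^\ell \setminus \{x(j)\} \\ H(w) \neq H(x(j))}} p\big(x, x(j \leftarrow w)\big) \\
&= 1 - \sum_{1\leq j\leq m} \; \sum_{\substack{c \in \{0,\dots,\ell\} \\ c \neq H(x(j))}} \; \sum_{\substack{w \in \mathcal{A}^\ell \\ H(w) = c}} p\big(x, x(j \leftarrow w)\big) .
\end{aligned}
\]

We have seen in the previous case that the last sum is a function of $\mathbb{H}(x)$
and $c$ only. The second sum as well depends only on $\mathbb{H}(x)$. Therefore the
above quantity is the same for $x$ and $y$. □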



We apply the classical lumping result (see theorem A.3) to conclude that
the distance process $(D_t)_{t\geq 0}$ is a Markov process. From the previous
computations, we see that its transition matrix $p_H$ is given by

\[
\forall d \in \{0,\dots,\ell\}^m \quad \forall j \in \{1,\dots,m\} \quad \forall c \in \{0,\dots,\ell\} \setminus \{d(j)\}
\qquad
p_H\big(d, d(j \leftarrow c)\big) \,=\, \frac{1}{m} \sum_{1\leq i\leq m} \frac{A_H(d(i))\,M_H(d(i),c)}{A_H(d(1)) + \cdots + A_H(d(m))} .
\]

7.2 Occupancy process

We denote by $\mathcal{P}^m_{\ell+1}$ the set of the ordered partitions of the integer $m$ in at
most $\ell+1$ parts:

\[ \mathcal{P}^m_{\ell+1} = \{\, (o(0), \dots, o(\ell)) \in \mathbb{N}^{\ell+1} : o(0) + \cdots + o(\ell) = m \,\} . \]
These partitions are interpreted as occupancy distributions. The partition
$(o(0), \dots, o(\ell))$ corresponds to a population in which $o(l)$ chromosomes are
at Hamming distance $l$ from the master sequence, for any $l \in \{0,\dots,\ell\}$.
Let $\mathcal{O}$ be the map which associates to each population $x$ its occupancy
distribution $\mathcal{O}(x) = (o(x,0), \dots, o(x,\ell))$, defined by:

\[ \forall l \in \{0,\dots,\ell\} \qquad o(x,l) = \text{card}\,\{\, i : 1 \leq i \leq m,\ d_H(x(i), w^*) = l \,\} . \]

The map $\mathcal{O}$ can be factorized through $\mathbb{H}$. For $d \in \{0,\dots,\ell\}^m$, we set

\[ o_H(d,l) = \text{card}\,\{\, i : 1 \leq i \leq m,\ d(i) = l \,\} \]

and we define a map $\mathcal{O}_H : \{0,\dots,\ell\}^m \to \mathcal{P}^m_{\ell+1}$ by setting

\[ \mathcal{O}_H(d) = \big( o_H(d,0), \dots, o_H(d,\ell) \big) . \]

We have then

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad \mathcal{O}(x) = \mathcal{O}_H(\mathbb{H}(x)) . \]

The map $\mathcal{O}$ lumps together populations which are permutations of each
other:

\[ \forall x \in (\mathcal{A}^\ell)^m \quad \forall \rho \in \mathfrak{S}_m \qquad \mathcal{O}(x) = \mathcal{O}(\rho \cdot x) . \]

We define the occupancy process $(O_t)_{t\geq 0}$ by setting

\[ \forall t \geq 0 \qquad O_t = \mathcal{O}(X_t) = \mathcal{O}_H(D_t) . \]

For the process $(D_t)_{t\geq 0}$, the only transitions having positive probability
are the transitions of the form

\[ d \to d(j \leftarrow c) , \qquad 1 \leq j \leq m, \quad c \in \{0,\dots,\ell\} . \]

Therefore the only possible transitions for the process $(O_t)_{t\geq 0}$ are

\[ o \to o(k \to l) , \qquad 0 \leq k, l \leq \ell , \]

where $o(k \to l)$ is the partition obtained by moving a chromosome from
the class $k$ to the class $l$, i.e.,

\[
\forall h \in \{0,\dots,\ell\} \qquad
o(k \to l)(h) =
\begin{cases}
o(h) & \text{if } h \neq k, l \\
o(k) - 1 & \text{if } h = k \\
o(l) + 1 & \text{if } h = l
\end{cases}
\]
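The maps $\mathbb{H}$, $\mathcal{O}_H$ and the move $o(k \to l)$ are simple to implement. The following Python sketch is an illustration with hypothetical function names; chromosomes are encoded as tuples of integers and occupancy distributions as lists of length $\ell + 1$.

```python
from typing import List, Sequence, Tuple

Chromosome = Tuple[int, ...]


def distances(x: Sequence[Chromosome], master: Chromosome) -> List[int]:
    """Vector of Hamming distances to the master sequence (the map written H on populations)."""
    return [sum(a != b for a, b in zip(u, master)) for u in x]


def occupancy(d: Sequence[int], ell: int) -> List[int]:
    """Occupancy distribution O_H(d): number of chromosomes in each Hamming class 0..ell."""
    o = [0] * (ell + 1)
    for c in d:
        o[c] += 1
    return o


def move(o: Sequence[int], k: int, l: int) -> List[int]:
    """Partition o(k -> l): move one chromosome from class k to class l."""
    if o[k] < 1:
        raise ValueError("class k is empty")
    new = list(o)
    new[k] -= 1
    new[l] += 1
    return new
```

Since `occupancy` only counts class sizes, two populations which are permutations of each other give the same result, as stated above.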
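Proposition 7.3 ($\mathcal{O}$ Lumpability) Let $p_H$ be the transition matrix of
the distance process. We have

\[
\forall o \in \mathcal{P}^m_{\ell+1} \quad \forall d, e \in \{0,\dots,\ell\}^m \qquad
\mathcal{O}_H(d) = \mathcal{O}_H(e) \;\Longrightarrow\;
\sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(f) = o}} p_H(d,f) \,=\, \sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(f) = o}} p_H(e,f) .
\]

Proof. Let $o \in \mathcal{P}^m_{\ell+1}$ and $d, e \in \{0,\dots,\ell\}^m$ be such that $\mathcal{O}_H(d) = \mathcal{O}_H(e)$.
Since $\mathcal{O}_H(d) = \mathcal{O}_H(e)$, there exists a permutation $\rho \in \mathfrak{S}_m$ such that
$\rho \cdot d = e$. By lemma 6.1, the transition matrices $p$ and $p_H$ are invariant
under the action of $\mathfrak{S}_m$, therefore

\[
\sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(f) = o}} p_H(d,f)
\,=\, \sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(f) = o}} p_H(\rho \cdot d, \rho \cdot f)
\,=\, \sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(\rho^{-1} \cdot f) = o}} p_H(e,f)
\,=\, \sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(f) = o}} p_H(e,f)
\]

as requested. □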



We apply the classical lumping result (see theorem A.3) to conclude that
the occupancy process $(O_t)_{t\geq 0}$ is a Markov process. Let us compute its
transition probabilities. Let $o \in \mathcal{P}^m_{\ell+1}$ and $d \in \{0,\dots,\ell\}^m$ be such that
$\mathcal{O}_H(d) \neq o$. Let us consider the sum

\[ \sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(f) = o}} p_H(d,f) . \]

The terms in the sum vanish unless

\[ \exists j \in \{1,\dots,m\} \quad \exists c \in \{0,\dots,\ell\}, \quad c \neq d(j), \qquad f = d(j \leftarrow c) . \]

Suppose that it is the case. If in addition $f$ is such that $\mathcal{O}_H(f) = o$, then

\[ o = \mathcal{O}_H(d)\big(d(j) \to c\big) . \]

Setting $k = d(j)$ and $l = c$, we conclude that

\[ \exists k, l \in \{0,\dots,\ell\} \qquad o = \mathcal{O}_H(d)(k \to l) . \]
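The two indices $k$, $l$ satisfying the above condition are distinct and unique.
We have then

\[
\begin{aligned}
\sum_{\substack{f \in \{0,\dots,\ell\}^m \\ \mathcal{O}_H(f) = o}} p_H(d,f)
&= \sum_{\substack{j \in \{1,\dots,m\} \\ d(j) = k}} p_H\big(d, d(j \leftarrow l)\big) \\
&= \sum_{\substack{j \in \{1,\dots,m\} \\ d(j) = k}} \frac{1}{m} \sum_{1\leq i\leq m} \frac{A_H(d(i))\,M_H(d(i),l)}{A_H(d(1)) + \cdots + A_H(d(m))} \\
&= \frac{o_H(d,k)}{m} \; \frac{\displaystyle\sum_{0\leq h\leq \ell} o_H(d,h)\,A_H(h)\,M_H(h,l)}{\displaystyle\sum_{0\leq h\leq \ell} o_H(d,h)\,A_H(h)} .
\end{aligned}
\]

This fraction is a function of $\mathcal{O}_H(d)$, $k$ and $l$, thus it depends only on
$\mathcal{O}_H(d)$ and $o$ as requested. We conclude that the transition matrix of the
occupancy process is given by

\[
\forall o \in \mathcal{P}^m_{\ell+1} \quad \forall k, l \in \{0,\dots,\ell\}, \quad k \neq l,
\qquad
p_O\big(o, o(k \to l)\big) \,=\, \frac{o(k) \displaystyle\sum_{h=0}^{\ell} o(h)\,A_H(h)\,M_H(h,l)}{m \displaystyle\sum_{h=0}^{\ell} o(h)\,A_H(h)} .
\]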
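The transition probabilities of the occupancy process translate directly into code. In the Python sketch below, which is an illustration with a hypothetical function name, the lumped mutation matrix is passed as a list of rows `M_H[h][l]`; any row-stochastic matrix can be used, and the small matrix of the example is made up for the test rather than derived from $p$ and $\kappa$.

```python
from typing import Sequence


def occupancy_transition(o: Sequence[int], k: int, l: int, sigma: float,
                         M_H: Sequence[Sequence[float]]) -> float:
    """p_O(o, o(k -> l)) for k != l on the sharp peak landscape."""
    if k == l:
        raise ValueError("the formula is stated for k != l")
    m = sum(o)
    fitness = [sigma if h == 0 else 1.0 for h in range(len(o))]  # A_H(h)
    total = sum(o[h] * fitness[h] for h in range(len(o)))
    numerator = sum(o[h] * fitness[h] * M_H[h][l] for h in range(len(o)))
    return o[k] * numerator / (m * total)
```

For $\ell = 1$, $o = (1, 3)$, $\sigma = 2$ and the illustrative matrix with rows $(0.9, 0.1)$ and $(0.2, 0.8)$, the two off-diagonal probabilities are $0.13$ and $0.36$; the remaining mass $0.51$ is the probability of staying at $o$.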



7.3 Invariant probability measures

There are several advantages in working with the lumped processes. The
main advantage is that the state space is considerably smaller. For the
process $(X_t)_{t\geq 0}$, the cardinality of the state space is

\[ \text{card}\,(\mathcal{A}^\ell)^m = \kappa^{\ell m} . \]

For the distance process $(D_t)_{t\geq 0}$, it becomes

\[ \text{card}\,\{0,\dots,\ell\}^m = (\ell+1)^m . \]

Finally for the occupancy process, the cardinality is the number of ordered
partitions of $m$ into at most $\ell+1$ parts. This number is quite complicated
to compute, but in any case

\[ \text{card}\,\mathcal{P}^m_{\ell+1} \leq (\ell+1)^m . \]
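To get a feeling for the reduction, the three cardinalities can be computed exactly with Python integers. The sketch below is an illustration with a hypothetical function name; it uses the standard stars and bars count $\binom{m+\ell}{\ell}$ for the number of $(\ell+1)$-tuples of non-negative integers with sum $m$, a formula which is not stated in the text.

```python
import math
from typing import Dict


def state_space_sizes(ell: int, m: int, kappa: int) -> Dict[str, int]:
    """Exact cardinalities of the three state spaces."""
    return {
        "population": kappa ** (ell * m),      # card (A^l)^m
        "distance": (ell + 1) ** m,            # card {0,...,l}^m
        "occupancy": math.comb(m + ell, ell),  # stars and bars count
    }
```

For $\ell = 20$, $m = 100$ and $\kappa = 4$, the occupancy space has about $3 \cdot 10^{22}$ states, against roughly $10^{132}$ for the distance process and $4^{2000}$ for the original process.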
Our goal is to estimate the law $\nu$ of the fraction of the master sequence in
the population at equilibrium. The probability measure $\nu$ is the probability
measure on the interval $[0,1]$ satisfying the following identities. For any
function $f : [0,1] \to \mathbb{R}$,

\[ \int_{[0,1]} f\,d\nu \,=\, \int_{(\mathcal{A}^\ell)^m} f\Big(\frac{1}{m} N(x)\Big)\,d\mu(x) \]

where $\mu$ is the invariant probability measure of the process $(X_t)_{t\geq 0}$ and
$N(x)$ is the number of copies of the master sequence $w^*$ present in the
population $x$:

\[ N(x) = \text{card}\,\{\, i : 1 \leq i \leq m,\ x(i) = w^* \,\} . \]

In fact, the probability measure $\nu$ is the image of $\mu$ through the map

\[ x \in (\mathcal{A}^\ell)^m \;\mapsto\; \frac{1}{m} N(x) \in [0,1] . \]

Yet $N(x)$ is also lumpable with respect to $\mathbb{H}$, i.e., it can be written as a
function of $\mathbb{H}(x)$:

\[ \forall x \in (\mathcal{A}^\ell)^m \qquad N(x) = N_H(\mathbb{H}(x)) , \]

where $N_H$ is the lumped function defined by

\[ \forall d \in \{0,\dots,\ell\}^m \qquad N_H(d) = \text{card}\,\{\, i : 1 \leq i \leq m,\ d(i) = 0 \,\} . \]

Let $\mu_H$ be the invariant probability measure of the process $(D_t)_{t\geq 0}$. For
$d \in \{0,\dots,\ell\}^m$, we have

\[
\mu_H(d) \,=\, \lim_{t\to\infty} P(D_t = d) \,=\, \lim_{t\to\infty} P(\mathbb{H}(X_t) = d)
\,=\, \lim_{t\to\infty} P\big(X_t \in \mathbb{H}^{-1}(d)\big) \,=\, \mu\big(\mathbb{H}^{-1}(d)\big) .
\]

Thus, as was naturally expected, the probability measure $\mu_H$ is the image
of the probability measure $\mu$ through the map $\mathbb{H}$. It follows that, for any
function $f : [0,1] \to \mathbb{R}$,

\[
\int_{[0,1]} f\,d\nu \,=\, \int_{(\mathcal{A}^\ell)^m} f\Big(\frac{1}{m} N(x)\Big)\,d\mu(x)
\,=\, \int_{\{0,\dots,\ell\}^m} f\Big(\frac{1}{m} N_H(d)\Big)\,d\mu_H(d) .
\]

Similarly, the invariant probability measure $\mu_O$ of the process $(O_t)_{t\geq 0}$ is
the image measure of $\mu$ through the map $\mathcal{O}$, and also the image measure
of $\mu_H$ through the map $\mathcal{O}_H$. We have also, for any function $f : [0,1] \to \mathbb{R}$,

\[ \int_{[0,1]} f\,d\nu \,=\, \int_{\mathcal{P}^m_{\ell+1}} f\Big(\frac{o(0)}{m}\Big)\,d\mu_O(o) . \]

Another advantage of the lumped processes is that the spaces $\{0,\dots,\ell\}^m$
and $\mathcal{P}^m_{\ell+1}$ are naturally endowed with a partial order. Since we cannot
deal directly with the distance process $(D_t)_{t\geq 0}$ or the occupancy process
$(O_t)_{t\geq 0}$, we shall compare them with auxiliary processes whose dynamics
is much simpler.
8 Monotonicity

A crucial property for comparing the Moran model with other processes is
monotonicity. We will realize a coupling of the lumped Moran processes
with different initial conditions and we will deduce the monotonicity from
the coupling construction.

8.1 Coupling of the lumped processes

We build here a coupling of the lumped processes, on the same probability
space as the coupling for the process $(X_t)_{t\geq 0}$ described in section 4. We
set

\[ \forall n \geq 1 \qquad R_n = (S_n, I_n, J_n, U_{n,1}, \dots, U_{n,\ell}) . \]

The vector $R_n$ is the random input which is used to perform the $n$-th step
of the Markov chain $(X_t)_{t\geq 0}$. By construction the sequence $(R_n)_{n\geq 1}$ is a
sequence of independent identically distributed random vectors with values
in

\[ \mathcal{R} = [0,1] \times \{1,\dots,m\}^2 \times [0,1]^\ell . \]

We first define two maps $\mathcal{M}_H$ and $\mathcal{S}_H$ in order to couple the mutation and
the selection mechanisms.

Mutation. We define a map

\[ \mathcal{M}_H : \{0,\dots,\ell\} \times [0,1]^\ell \to \{0,\dots,\ell\} \]

in order to couple the mutation mechanism starting with different
chromosomes. Let $b \in \{0,\dots,\ell\}$ and let $u_1, \dots, u_\ell \in [0,1]$. The map $\mathcal{M}_H$ is
defined by setting

\[ \mathcal{M}_H(b, u_1, \dots, u_\ell) \,=\, b - \sum_{k=1}^{b} 1_{u_k < p/\kappa} + \sum_{k=b+1}^{\ell} 1_{u_k > 1 - p(1-1/\kappa)} . \]

The map $\mathcal{M}_H$ is built in such a way that, if $U_1, \dots, U_\ell$ are random variables
with uniform law on the interval $[0,1]$, all being independent, then for any
$b \in \{0,\dots,\ell\}$, the law of $\mathcal{M}_H(b, U_1, \dots, U_\ell)$ is given by the line of the
mutation matrix $M_H$ associated to $b$, i.e.,

\[ \forall c \in \{0,\dots,\ell\} \qquad P\big( \mathcal{M}_H(b, U_1, \dots, U_\ell) = c \big) = M_H(b,c) . \]

Selection for the distance process. We realize the replication
mechanism with the help of a selection map

\[ \mathcal{S}_H : \{0,\dots,\ell\}^m \times [0,1] \to \{1,\dots,m\} . \]
Let $d \in \{0,\dots,\ell\}^m$ and let $s \in [0,1[$. We define $\mathcal{S}_H(d,s) = i$ where $i$ is
the unique index in $\{1,\dots,m\}$ satisfying

\[ \frac{A_H(d(1)) + \cdots + A_H(d(i-1))}{A_H(d(1)) + \cdots + A_H(d(m))} \,\leq\, s \,<\, \frac{A_H(d(1)) + \cdots + A_H(d(i))}{A_H(d(1)) + \cdots + A_H(d(m))} . \]

The map $\mathcal{S}_H$ is built in such a way that, if $S$ is a random variable with
uniform law on the interval $[0,1]$, then for any $d \in \{0,\dots,\ell\}^m$, the law of
$\mathcal{S}_H(d,S)$ is given by

\[ \forall i \in \{1,\dots,m\} \qquad P\big( \mathcal{S}_H(d,S) = i \big) = \frac{A_H(d(i))}{A_H(d(1)) + \cdots + A_H(d(m))} . \]

Coupling for the distance process. We build a deterministic map

\[ \Phi_H : \{0,\dots,\ell\}^m \times \mathcal{R} \to \{0,\dots,\ell\}^m \]

in order to realize the coupling between distance processes with various
initial conditions and different parameters $\sigma$ or $p$. The coupling map $\Phi_H$
is defined by

\[
\forall r = (s, i, j, u_1, \dots, u_\ell) \in \mathcal{R} \quad \forall d \in \{0,\dots,\ell\}^m
\qquad
\Phi_H(d, r) = d\big( j \leftarrow \mathcal{M}_H( d(\mathcal{S}_H(d,s)), u_1, \dots, u_\ell ) \big) .
\]

Notice that the index $i$ is not used in the map $\Phi_H$. The coupling is then
built in a standard way with the help of the i.i.d. sequence $(R_n)_{n\geq 1}$ and
the map $\Phi_H$. Let $d \in \{0,\dots,\ell\}^m$ be the starting point of the process. We
build the distance process $(D_t)_{t\geq 0}$ by setting $D_0 = d$ and

\[ \forall n \geq 1 \qquad D_n = \Phi_H(D_{n-1}, R_n) . \]

A routine check shows that the process $(D_t)_{t\geq 0}$ is a Markov chain starting
from $d$ with the adequate transition matrix. This way we have coupled the
distance processes with various initial conditions and different parameters
$\sigma$ or $p$.

Selection for the occupancy process. We realize the replication
mechanism with the help of a selection map

\[ \mathcal{S}_O : \mathcal{P}^m_{\ell+1} \times [0,1] \to \{0,\dots,\ell\} . \]

Let $o \in \mathcal{P}^m_{\ell+1}$ and let $s \in [0,1[$. We define $\mathcal{S}_O(o,s) = l$ where $l$ is the unique
index in $\{0,\dots,\ell\}$ satisfying

\[ \frac{o(0)A_H(0) + \cdots + o(l-1)A_H(l-1)}{o(0)A_H(0) + \cdots + o(\ell)A_H(\ell)} \,\leq\, s \,<\, \frac{o(0)A_H(0) + \cdots + o(l)A_H(l)}{o(0)A_H(0) + \cdots + o(\ell)A_H(\ell)} . \]
The map $\mathcal{S}_O$ is built in such a way that, if $S$ is a random variable with
uniform law on the interval $[0,1]$, then for any $o \in \mathcal{P}^m_{\ell+1}$, the law of $\mathcal{S}_O(o,S)$
is given by

\[ \forall l \in \{0,\dots,\ell\} \qquad P\big( \mathcal{S}_O(o,S) = l \big) = \frac{o(l)\,A_H(l)}{o(0)A_H(0) + \cdots + o(\ell)A_H(\ell)} . \]

Coupling for the occupancy process. We build a deterministic map

\[ \Phi_O : \mathcal{P}^m_{\ell+1} \times \mathcal{R} \to \mathcal{P}^m_{\ell+1} \]

in order to realize the coupling between occupancy processes with various
initial conditions and different parameters $\sigma$ or $p$. The coupling map $\Phi_O$
is defined as follows. Let $r = (s, i, j, u_1, \dots, u_\ell) \in \mathcal{R}$. Let $o \in \mathcal{P}^m_{\ell+1}$, let us
set $l = \mathcal{S}_O(o,s)$ and let $k$ be the unique index in $\{0,\dots,\ell\}$ satisfying

\[ o(0) + \cdots + o(k-1) \,<\, j \,\leq\, o(0) + \cdots + o(k) . \]

The coupling map $\Phi_O$ is defined by

\[ \Phi_O(o, r) = o\big( k \to \mathcal{M}_H(l, u_1, \dots, u_\ell) \big) . \]

Notice that the index $i$ is not used in the map $\Phi_O$. Let $o \in \mathcal{P}^m_{\ell+1}$ be the
starting point of the process. We build the occupancy process $(O_t)_{t\geq 0}$ by
setting $O_0 = o$ and

\[ \forall n \geq 1 \qquad O_n = \Phi_O(O_{n-1}, R_n) . \]

A routine check shows that the process $(O_t)_{t\geq 0}$ is a Markov chain starting
from $o$ with the adequate transition matrix. This way we have coupled the
occupancy processes with various initial conditions and different
parameters $\sigma$ or $p$.
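The coupling map $\Phi_O$ also gives a direct way to simulate the occupancy process. The Python sketch below is a minimal illustration, not the author's C implementation: all function names are hypothetical, occupancy distributions are lists of length $\ell+1$, the index $j$ is 1-based as in the text, and the unused index $i$ of the random input is simply omitted.

```python
import random
from typing import List, Sequence


def mutation_map(b: int, u: Sequence[float], p: float, kappa: int) -> int:
    """The map M_H(b, u_1, ..., u_l) coupling the mutations."""
    back = sum(1 for k in range(b) if u[k] < p / kappa)
    away = sum(1 for k in range(b, len(u)) if u[k] > 1.0 - p * (1.0 - 1.0 / kappa))
    return b - back + away


def select_class(o: Sequence[int], s: float, sigma: float) -> int:
    """The map S_O(o, s): Hamming class of the chromosome which replicates."""
    weights = [o[h] * (sigma if h == 0 else 1.0) for h in range(len(o))]
    total, cumulative = sum(weights), 0.0
    for l, w in enumerate(weights):
        cumulative += w
        if s * total < cumulative:
            return l
    raise ValueError("s must lie in [0, 1[")


def class_of_index(o: Sequence[int], j: int) -> int:
    """Unique k with o(0) + ... + o(k-1) < j <= o(0) + ... + o(k), for j in 1..m."""
    cumulative = 0
    for k, count in enumerate(o):
        cumulative += count
        if j <= cumulative:
            return k
    raise ValueError("j must lie in 1..m")


def occupancy_step(o: Sequence[int], s: float, j: int, u: Sequence[float],
                   sigma: float, p: float, kappa: int) -> List[int]:
    """The coupling map Phi_O(o, r) with r = (s, j, u)."""
    l = select_class(o, s, sigma)
    k = class_of_index(o, j)
    new = list(o)
    new[k] -= 1
    new[mutation_map(l, u, p, kappa)] += 1
    return new


def simulate(o: Sequence[int], steps: int, sigma: float, p: float, kappa: int,
             seed: int = 0) -> List[int]:
    """Run the occupancy chain for a given number of steps with pseudo-random inputs."""
    rng = random.Random(seed)
    m, ell = sum(o), len(o) - 1
    state = list(o)
    for _ in range(steps):
        u = [rng.random() for _ in range(ell)]
        state = occupancy_step(state, rng.random(), rng.randint(1, m), u, sigma, p, kappa)
    return state
```

Running `simulate` from two initial distributions with the same seed feeds both chains the same random inputs, which is exactly the coupling described above. The density of the master sequence at a given time is `state[0] / m`.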

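8.2 Monotonicity of the model

The space $\{0,\dots,\ell\}^m$ is naturally endowed with a partial order:

\[ d \leq e \quad\Longleftrightarrow\quad \forall i \in \{1,\dots,m\} \quad d(i) \leq e(i) . \]

Lemma 8.1 The map $\mathcal{M}_H$ is non-decreasing with respect to the Hamming
class, i.e.,

\[
\forall b, c \in \{0,\dots,\ell\} \quad \forall u_1, \dots, u_\ell \in [0,1]
\qquad
b \leq c \;\Longrightarrow\; \mathcal{M}_H(b, u_1, \dots, u_\ell) \leq \mathcal{M}_H(c, u_1, \dots, u_\ell) .
\]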
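Proof. We simply use the definition of $\mathcal{M}_H$ (see section 8.1) and we
compute the difference

\[
\mathcal{M}_H(c, u_1, \dots, u_\ell) - \mathcal{M}_H(b, u_1, \dots, u_\ell) \,=\,
c - b - \sum_{k=b+1}^{c} \big( 1_{u_k > 1 - p(1-1/\kappa)} + 1_{u_k < p/\kappa} \big) .
\]

Since $p/\kappa + p(1-1/\kappa) = p \leq 1$, the two events $\{u_k < p/\kappa\}$ and
$\{u_k > 1 - p(1-1/\kappa)\}$ are disjoint, so each term of the sum is at most $1$.
Hence the sum is at most $c - b$ and the above difference is non-negative. □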

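Lemma 8.2 In the neutral case $\sigma = 1$, the map $\mathcal{S}_H$ is non-decreasing with
respect to the Hamming class, i.e.,

\[
\forall d, e \in \{0,\dots,\ell\}^m \quad \forall s \in [0,1[
\qquad
d \leq e \;\Longrightarrow\; d(\mathcal{S}_H(d,s)) \leq e(\mathcal{S}_H(e,s)) .
\]

Proof. In fact, when $\sigma = 1$, the map $\mathcal{S}_H$ depends only on the second
variable $s$:

\[ \forall d \in \{0,\dots,\ell\}^m \quad \forall s \in [0,1[ \qquad \mathcal{S}_H(d,s) = \lfloor ms \rfloor + 1 . \]

It follows that if $d, e \in \{0,\dots,\ell\}^m$ are such that $d \leq e$, then

\[ \forall s \in [0,1[ \qquad d(\lfloor ms \rfloor + 1) \leq e(\lfloor ms \rfloor + 1) \]

as requested. □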

Lemma 8.3 In the neutral case $\sigma = 1$, the map $\Phi_H$ is non-decreasing with
respect to the distances, i.e.,

$$\forall d, e \in \{0, \dots, \ell\}^m \quad \forall r \in \mathcal{R} \qquad d \le e \implies \Phi_H(d, r) \le \Phi_H(e, r) .$$

Proof. Let $r = (s, i, j, u_1, \dots, u_\ell) \in \mathcal{R}$ and let $d, e \in \{0, \dots, \ell\}^m$, $d \le e$.
By lemma 8.2 we have

$$d(\mathcal{S}_H(d, s)) \le e(\mathcal{S}_H(e, s)) .$$

This inequality and lemma 8.1 imply that

$$\mathcal{M}_H(d(\mathcal{S}_H(d, s)), u_1, \dots, u_\ell) \le \mathcal{M}_H(e(\mathcal{S}_H(e, s)), u_1, \dots, u_\ell) ,$$

so that

$$d\big(j \leftarrow \mathcal{M}_H(d(\mathcal{S}_H(d, s)), u_1, \dots, u_\ell)\big) \le e\big(j \leftarrow \mathcal{M}_H(e(\mathcal{S}_H(e, s)), u_1, \dots, u_\ell)\big) ,$$

whence $\Phi_H(d, r) \le \Phi_H(e, r)$ as requested. □

Unfortunately, the map $\Phi_H$ is not monotone for $\sigma > 1$. Indeed, suppose
that



then 



= 3, 

Ml, . 
H 



m = 6 

p _ 2p 
3' 3 



2 3 

3 <S< 4^ 

.7 = i = 1 . 




This creates a serious complication. This is why we perform a second
lumping and we work with the occupancy process rather than with the
distance process. We define an order $\preceq$ on $\mathcal{P}^m_{\ell+1}$ as follows. Let $o =
(o(0), \dots, o(\ell))$ and $o' = (o'(0), \dots, o'(\ell))$ belong to $\mathcal{P}^m_{\ell+1}$. We say that $o$ is
smaller than or equal to $o'$, which we denote by $o \preceq o'$, if

$$\forall l \le \ell \qquad o(0) + \cdots + o(l) \le o'(0) + \cdots + o'(l) .$$



Lemma 8.4 The map $\mathcal{S}_O$ is non-increasing with respect to the occupancy
distribution, i.e.,

$$\forall o, o' \in \mathcal{P}^m_{\ell+1} \quad \forall s \in [0, 1] \qquad o \preceq o' \implies \mathcal{S}_O(o, s) \ge \mathcal{S}_O(o', s) .$$

Proof. Let $o \preceq o'$. Let $l \in \{0, \dots, \ell\}$. We have

$$o(0) A_H(0) + \cdots + o(l) A_H(l) = o(0)(\sigma - 1) + o(0) + \cdots + o(l) .$$

Thus

$$\frac{o(0) A_H(0) + \cdots + o(l) A_H(l)}{o(0) A_H(0) + \cdots + o(\ell) A_H(\ell)} = \psi\big(o(0), \, o(0) + \cdots + o(l)\big) ,$$

where $\psi$ is the function defined by

$$\psi(\eta, \xi) = \frac{\eta(\sigma - 1) + \xi}{\eta(\sigma - 1) + m} .$$

The map $\psi$ is non-decreasing in $\eta$ and $\xi$ on $[0, m]^2$, therefore

$$\psi\big(o(0), \, o(0) + \cdots + o(l)\big) \le \psi\big(o'(0), \, o'(0) + \cdots + o'(l)\big) ,$$

i.e.,

$$\frac{o(0) A_H(0) + \cdots + o(l) A_H(l)}{o(0) A_H(0) + \cdots + o(\ell) A_H(\ell)} \le \frac{o'(0) A_H(0) + \cdots + o'(l) A_H(l)}{o'(0) A_H(0) + \cdots + o'(\ell) A_H(\ell)} .$$

It follows that $\mathcal{S}_O(o, s) \ge \mathcal{S}_O(o', s)$ for any $s \in [0, 1]$. □

Lemma 8.5 The map $\Phi_O$ is non-decreasing with respect to the occupancy
distributions, i.e.,

$$\forall o, o' \in \mathcal{P}^m_{\ell+1} \quad \forall r \in \mathcal{R} \qquad o \preceq o' \implies \Phi_O(o, r) \preceq \Phi_O(o', r) .$$

Proof. Let $r = (s, i, j, u_1, \dots, u_\ell) \in \mathcal{R}$ and let $o, o' \in \mathcal{P}^m_{\ell+1}$ be such that
$o \preceq o'$. Let us set $l = \mathcal{S}_O(o, s)$, $l' = \mathcal{S}_O(o', s)$ and let $k, k'$ be the unique
indices in $\{0, \dots, \ell\}$ satisfying

$$o(0) + \cdots + o(k-1) < j \le o(0) + \cdots + o(k) ,$$
$$o'(0) + \cdots + o'(k'-1) < j \le o'(0) + \cdots + o'(k') .$$

Since $o \preceq o'$, then $k \ge k'$. Let us set

$$b = \mathcal{M}_H(l, u_1, \dots, u_\ell) , \qquad b' = \mathcal{M}_H(l', u_1, \dots, u_\ell) .$$

Since $l \ge l'$ by lemma 8.4, then $b \ge b'$ by lemma 8.1. We must now compare

$$\Phi_O(o, r) = o(k \to b) , \qquad \Phi_O(o', r) = o'(k' \to b') .$$

Let $h \in \{0, \dots, \ell\}$. We have

$$o(k \to b)(0) + \cdots + o(k \to b)(h) = o(0) + \cdots + o(h) - 1_{k \le h} + 1_{b \le h} .$$

Since $o \preceq o'$, then $o(0) + \cdots + o(h) \le o'(0) + \cdots + o'(h)$. Since $b \ge b'$, then
$1_{b \le h} \le 1_{b' \le h}$. The problem comes from the indicator function $1_{k \le h}$. We
consider several cases:

• $k \le h$. Then

$$o(0) + \cdots + o(h) - 1_{k \le h} + 1_{b \le h} \le o'(0) + \cdots + o'(h) - 1 + 1_{b' \le h} \le o'(0) + \cdots + o'(h) - 1_{k' \le h} + 1_{b' \le h} .$$

• $k' \le h < k$. The definition of $k, k'$ implies that

$$o(0) + \cdots + o(h) < j \le o'(0) + \cdots + o'(h) ,$$

whence

$$o(0) + \cdots + o(h) \le o'(0) + \cdots + o'(h) - 1 .$$

It follows that

$$o(0) + \cdots + o(h) + 1_{b \le h} \le o'(0) + \cdots + o'(h) - 1_{k' \le h} + 1_{b' \le h} .$$

• $h < k'$. Then

$$o(0) + \cdots + o(h) + 1_{b \le h} \le o'(0) + \cdots + o'(h) + 1_{b' \le h} .$$

In each case, we have

$$o(k \to b)(0) + \cdots + o(k \to b)(h) \le o'(k' \to b')(0) + \cdots + o'(k' \to b')(h) .$$

Therefore $\Phi_O(o, r) \preceq \Phi_O(o', r)$ as requested. □
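The monotonicity stated in lemma 8.5 can be checked by brute force on a tiny instance. The sketch below is only illustrative and rests on assumptions that are not in the text: the selection map is taken to be the inverse distribution function of the fitness-weighted Hamming classes (with $s \in [0, 1[$), and the helper names `select`, `mutate`, `phi_O`, `below` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def compositions(m, parts):
    """All tuples of `parts` non-negative integers summing to m."""
    if parts == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(m - first, parts - 1):
            yield (first,) + rest

def below(o, o2):
    """Order o <= o2: every partial sum of o is at most that of o2."""
    a = b = 0
    for x, y in zip(o, o2):
        a, b = a + x, b + y
        if a > b:
            return False
    return True

def select(o, s, sigma):
    """ASSUMED selection map: smallest class l whose cumulative fitness exceeds s."""
    total = o[0] * (sigma - 1) + sum(o)
    cum = o[0] * (sigma - 1)
    for l, x in enumerate(o):
        cum += x
        if cum > s * total:
            return l
    raise ValueError("s must lie in [0, 1)")

def mutate(b, u, p, kappa):
    """Mutation map: first b loci may be repaired, the others may be damaged."""
    repaired = sum(1 for x in u[:b] if x < p / kappa)
    damaged = sum(1 for x in u[b:] if x > 1 - p * (1 - Fraction(1, kappa)))
    return b - repaired + damaged

def phi_O(o, s, j, u, sigma, p, kappa):
    """Coupling map: move one chromosome from class k to class M_H(l, u)."""
    l = select(o, s, sigma)
    cum = 0
    for k, x in enumerate(o):
        cum += x
        if j <= cum:
            break
    b = mutate(l, u, p, kappa)
    new = list(o)
    new[k] -= 1
    new[b] += 1
    return tuple(new)

def check_monotone(m=3, ell=2, sigma=Fraction(3), p=Fraction(1, 2), kappa=2):
    """Test o <= o' => phi_O(o, r) <= phi_O(o', r) on a finite grid of r."""
    states = list(compositions(m, ell + 1))
    s_grid = [Fraction(i, 8) for i in range(8)]
    u_grid = [Fraction(0), Fraction(1, 8), Fraction(1, 2), Fraction(7, 8)]
    for o, o2 in product(states, states):
        if not below(o, o2):
            continue
        for s, j, u in product(s_grid, range(1, m + 1), product(u_grid, repeat=ell)):
            if not below(phi_O(o, s, j, u, sigma, p, kappa),
                         phi_O(o2, s, j, u, sigma, p, kappa)):
                return False
    return True
```

Exact rational arithmetic is used so that the comparisons defining the selected class are not affected by rounding.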

Let us try to see the implications of the previous results for the mono- 
tonicity of the model (see the appendix for the definition of a monotone 
process). There is not much to do with the original Moran model, because 
its state space is not partially ordered. So we examine the distance process 
and the occupancy process. 

Corollary 8.6 In the neutral case $\sigma = 1$, the distance process $(D_t)_{t \ge 0}$ is
monotone.

Indeed, by lemma 8.3 the map $\Phi_H$ is non-decreasing in the neutral case
$\sigma = 1$, hence the coupling is monotone. Unfortunately, we did not manage
to reach the same conclusion in the non neutral case. The main point of
lumping further the distance process is to get a process which is monotone
even in the non neutral case.

Corollary 8.7 The occupancy process $(O_t)_{t \ge 0}$ is monotone.

By lemma 8.5 the coupling for the occupancy process is monotone.
9 Stochastic bounds

In this section, we take advantage of the monotonicity of the map $\Phi_O$ to
compare the process $(O_t)_{t \ge 0}$ with simpler processes.

9.1 Lower and upper processes 

We shall construct a lower process $(O^\ell_t)_{t \ge 0}$ and an upper process $(O^1_t)_{t \ge 0}$
satisfying

$$\forall t \ge 0 \qquad O^\ell_t \preceq O_t \preceq O^1_t .$$

Loosely speaking, the lower process evolves as follows. As long as there is
no master sequence present in the population, the process $(O^\ell_t)_{t \ge 0}$ evolves
exactly as the initial process $(O_t)_{t \ge 0}$. When the first master sequence
appears, all the other chromosomes are set in the last Hamming class $\ell$, i.e.,
the process jumps to the state $(1, 0, \dots, 0, m - 1)$. As long as the master
sequence is present, the mutations on non master sequences leading to non
master sequences are suppressed, and any mutation of a master sequence
leads to a chromosome in the last Hamming class. The dynamics of the
upper process is similar, except that the chromosomes distinct from the
master sequence are sent to the first Hamming class 1 instead of the last
one; this is the assignment consistent with the maps $\pi_\ell$, $\pi_1$ and the order
$\preceq$ used below. We shall next construct precisely these dynamics. We define two maps
$\pi_\ell, \pi_1 : \mathcal{P}^m_{\ell+1} \to \mathcal{P}^m_{\ell+1}$ by setting

$$\forall o \in \mathcal{P}^m_{\ell+1} \qquad \pi_\ell(o) = (o(0), 0, \dots, 0, m - o(0)) , \qquad \pi_1(o) = (o(0), m - o(0), 0, \dots, 0) .$$

Obviously,

$$\forall o \in \mathcal{P}^m_{\ell+1} \qquad \pi_\ell(o) \preceq o \preceq \pi_1(o) .$$

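We denote by $\mathcal{W}^*$ the set of the occupancy distributions containing the
master sequence, i.e.,

$$\mathcal{W}^* = \{ o \in \mathcal{P}^m_{\ell+1} : o(0) \ge 1 \}$$

and by $\mathcal{N}$ the set of the occupancy distributions which do not contain the
master sequence, i.e.,

$$\mathcal{N} = \{ o \in \mathcal{P}^m_{\ell+1} : o(0) = 0 \} .$$

Let $\Phi_O$ be the coupling map defined in section 8.1. We define a lower map
$\Phi^\ell_O$ by setting, for $o \in \mathcal{P}^m_{\ell+1}$ and $r \in \mathcal{R}$,

$$\Phi^\ell_O(o, r) = \begin{cases} \Phi_O(o, r) & \text{if } o \in \mathcal{N} \text{ and } \Phi_O(o, r) \notin \mathcal{W}^* \\ \pi_\ell(\Phi_O(o, r)) & \text{if } o \in \mathcal{N} \text{ and } \Phi_O(o, r) \in \mathcal{W}^* \\ \pi_\ell(\Phi_O(\pi_\ell(o), r)) & \text{if } o \in \mathcal{W}^* \end{cases}$$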
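Similarly, we define an upper map $\Phi^1_O$ by setting, for $o \in \mathcal{P}^m_{\ell+1}$ and $r \in \mathcal{R}$,

$$\Phi^1_O(o, r) = \begin{cases} \Phi_O(o, r) & \text{if } o \in \mathcal{N} \text{ and } \Phi_O(o, r) \notin \mathcal{W}^* \\ \pi_1(\Phi_O(o, r)) & \text{if } o \in \mathcal{N} \text{ and } \Phi_O(o, r) \in \mathcal{W}^* \\ \pi_1(\Phi_O(\pi_1(o), r)) & \text{if } o \in \mathcal{W}^* \end{cases}$$

A direct application of lemma 8.5 yields that the map $\Phi^\ell_O$ is below the map
$\Phi_O$ and the map $\Phi^1_O$ is above the map $\Phi_O$ in the following sense:

$$\forall r \in \mathcal{R} \quad \forall o \in \mathcal{P}^m_{\ell+1} \qquad \Phi^\ell_O(o, r) \preceq \Phi_O(o, r) \preceq \Phi^1_O(o, r) .$$

We define a lower process $(O^\ell_t)_{t \ge 0}$ and an upper process $(O^1_t)_{t \ge 0}$ with the
help of the i.i.d. sequence $(R_n)_{n \ge 1}$ and the maps $\Phi^\ell_O$, $\Phi^1_O$ as follows. Let
$o \in \mathcal{P}^m_{\ell+1}$ be the starting point of the process. We set $O^\ell_0 = O^1_0 = o$
and

$$\forall n \ge 1 \qquad O^\ell_n = \Phi^\ell_O(O^\ell_{n-1}, R_n) , \qquad O^1_n = \Phi^1_O(O^1_{n-1}, R_n) .$$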



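Proposition 9.1 Suppose that the processes $(O^\ell_t)_{t \ge 0}$, $(O_t)_{t \ge 0}$, $(O^1_t)_{t \ge 0}$
start from the same occupancy distribution $o$. We have

$$\forall t \ge 0 \qquad O^\ell_t \preceq O_t \preceq O^1_t .$$

Proof. We prove the inequality by induction over $n \in \mathbb{N}$. For $n = 0$ we
have $O_0 = O^\ell_0 = O^1_0 = o$. Suppose that the inequality has been
proved at time $t = n \in \mathbb{N}$, so that $O^\ell_n \preceq O_n \preceq O^1_n$. By construction, we
have

$$O^\ell_{n+1} = \Phi^\ell_O(O^\ell_n, R_{n+1}) , \qquad O_{n+1} = \Phi_O(O_n, R_{n+1}) , \qquad O^1_{n+1} = \Phi^1_O(O^1_n, R_{n+1}) .$$

We use the induction hypothesis and we apply lemma 8.5 to get

$$\Phi_O(O^\ell_n, R_{n+1}) \preceq \Phi_O(O_n, R_{n+1}) \preceq \Phi_O(O^1_n, R_{n+1}) .$$

Yet the map $\Phi^\ell_O$ is below the map $\Phi_O$ and the map $\Phi^1_O$ is above the map
$\Phi_O$, thus

$$\Phi^\ell_O(O^\ell_n, R_{n+1}) \preceq \Phi_O(O^\ell_n, R_{n+1}) , \qquad \Phi_O(O^1_n, R_{n+1}) \preceq \Phi^1_O(O^1_n, R_{n+1}) .$$

Putting together these inequalities we obtain that $O^\ell_{n+1} \preceq O_{n+1} \preceq O^1_{n+1}$
and the induction step is completed. □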
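9.2 Dynamics of the bounding processes

We study next the dynamics of the processes $(O^\ell_t)_{t \ge 0}$ and $(O^1_t)_{t \ge 0}$ in $\mathcal{W}^*$.
The computations are the same for both processes. Throughout the section,
we let $\theta$ be either 1 or $\ell$ and we denote by $(O^\theta_t)_{t \ge 0}$ the corresponding
process. For the process $(O^\theta_t)_{t \ge 0}$, the states

$$\mathcal{T}^\theta = \{ o \in \mathcal{P}^m_{\ell+1} : o(0) \ge 1 \text{ and } o(0) + o(\theta) < m \}$$

are transient, while the populations in $\mathcal{N} \cup (\mathcal{W}^* \setminus \mathcal{T}^\theta)$ form a recurrent
class. Let us look at the transition mechanism of the process restricted to
$\mathcal{W}^* \setminus \mathcal{T}^\theta$. Since

$$\mathcal{W}^* \setminus \mathcal{T}^\theta = \{ o \in \mathcal{P}^m_{\ell+1} : o(0) \ge 1 \text{ and } o(0) + o(\theta) = m \} ,$$

we see that a state of $\mathcal{W}^* \setminus \mathcal{T}^\theta$ is completely determined by the first
occupancy number, or equivalently the number of copies of the master sequence
present in the population. Let $o^\theta_{\text{enter}}$ be the occupancy distribution having
one master sequence and $m - 1$ chromosomes in the Hamming class $\theta$:

$$\forall l \in \{0, \dots, \ell\} \qquad o^\theta_{\text{enter}}(l) = \begin{cases} 1 & \text{if } l = 0 \\ m - 1 & \text{if } l = \theta \\ 0 & \text{otherwise} \end{cases}$$

The process $(O^\theta_t)_{t \ge 0}$ always enters the set $\mathcal{W}^* \setminus \mathcal{T}^\theta$ at $o^\theta_{\text{enter}}$. The only
possible transitions for the first occupancy number of the process $(O^\theta_t)_{t \ge 0}$
starting from a point in $\mathcal{W}^* \setminus \mathcal{T}^\theta$ are

$$o(0) \to o(0) - 1 , \qquad 1 \le o(0) \le m ,$$
$$o(0) \to o(0) + 1 , \qquad 1 \le o(0) \le m - 1 .$$

Let $o^\theta_{\text{exit}}$ be the occupancy distribution having $m$ chromosomes in the
Hamming class $\theta$:

$$\forall l \in \{0, \dots, \ell\} \qquad o^\theta_{\text{exit}}(l) = \begin{cases} m & \text{if } l = \theta \\ 0 & \text{otherwise} \end{cases}$$

The process $(O^\theta_t)_{t \ge 0}$ always exits $\mathcal{W}^* \setminus \mathcal{T}^\theta$ at $o^\theta_{\text{exit}}$. From the previous
observations, we conclude that, whenever $(O^\theta_t)_{t \ge 0}$ starts in $\mathcal{W}^* \setminus \mathcal{T}^\theta$, the
dynamics of $(O^\theta_t(0))_{t \ge 0}$ is the one of a standard birth and death process,
until the time of exit from $\mathcal{W}^* \setminus \mathcal{T}^\theta$. We denote by $(Z^\theta_t)_{t \ge 0}$ a birth and death
process on $\{0, \dots, m\}$ starting at $Z^\theta_0 = 1$ with the following transition
probabilities: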
Transitions to the left. For $i \in \{1, \dots, m\}$,

$$P(Z^\theta_{t+1} = i - 1 \mid Z^\theta_t = i) = P(O^\theta_{t+1}(0) = i - 1 \mid O^\theta_t(0) = i) = \frac{\sigma i^2 (1 - M_H(0,0)) + i(m - i)(1 - M_H(\theta,0))}{m(\sigma i + m - i)} .$$

Transitions to the right. For $i \in \{0, \dots, m - 1\}$,

$$P(Z^\theta_{t+1} = i + 1 \mid Z^\theta_t = i) = P(O^\theta_{t+1}(0) = i + 1 \mid O^\theta_t(0) = i) = \frac{\sigma i (m - i) M_H(0,0) + (m - i)^2 M_H(\theta,0)}{m(\sigma i + m - i)} .$$



9.3 A renewal argument 



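Let $(X_t)_{t \ge 0}$ be a discrete time Markov chain with values in a finite state
space $\mathcal{E}$ which is irreducible and aperiodic. Let $\mu$ be the invariant
probability measure of the Markov chain $(X_t)_{t \ge 0}$.

Proposition 9.2 Let $\mathcal{W}^*$ be a subset of $\mathcal{E}$ and let $e$ be a point of $\mathcal{E} \setminus \mathcal{W}^*$.
Let $f$ be a map from $\mathcal{E}$ to $\mathbb{R}$ which vanishes on $\mathcal{E} \setminus \mathcal{W}^*$. Let

$$\tau^* = \inf \{ t \ge 0 : X_t \in \mathcal{W}^* \} , \qquad \tau = \inf \{ t \ge \tau^* : X_t = e \} .$$

We have

$$\int_{\mathcal{E}} f(x) \, d\mu(x) = \frac{1}{E(\tau \mid X_0 = e)} \, E\Big( \int_{\tau^*}^{\tau} f(X_s) \, ds \,\Big|\, X_0 = e \Big) .$$

Proof. We define two sequences $(\tau^*_k)_{k \ge 1}$, $(\tau_k)_{k \ge 0}$ of stopping times by
setting $\tau_0 = 0$ and

$$\tau^*_1 = \inf \{ t \ge 0 : X_t \in \mathcal{W}^* \} , \qquad \tau_1 = \inf \{ t \ge \tau^*_1 : X_t = e \} ,$$
$$\vdots$$
$$\tau^*_k = \inf \{ t \ge \tau_{k-1} : X_t \in \mathcal{W}^* \} , \qquad \tau_k = \inf \{ t \ge \tau^*_k : X_t = e \} .$$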


Our first goal is to evaluate the asymptotic behavior of $\tau_k$ as $k$ goes to $\infty$.
For any $k \ge 1$, by the strong Markov property, the trajectory $(X_t)_{t \ge \tau_k}$ of
the process after time $\tau_k$ is independent from the trajectory $(X_t)_{t \le \tau_k}$ of the
process until time $\tau_k$, and its law is the same as the law of the whole process
$(X_t)_{t \ge 0}$ starting from $e$. As a consequence, the successive excursions

$$(X_t , \ \tau_k \le t \le \tau_{k+1}) , \qquad k \ge 1 ,$$

are independent identically distributed. In particular, the sequence

$$(\tau_{k+1} - \tau_k)_{k \ge 1}$$

is a sequence of i.i.d. random variables, having the same law as the random
time $\tau_1$ whenever the process $(X_t)_{t \ge 0}$ starts from $e$. For $k \ge 1$, we
decompose $\tau_k$ as the sum

$$\tau_k = \tau_1 + \sum_{h=1}^{k-1} (\tau_{h+1} - \tau_h) .$$

We denote by $E_e(\cdot)$ the expectation for the process $(X_t)_{t \ge 0}$ starting from $e$.
Since the state space $\mathcal{E}$ is finite, then the random time $\tau_1$ is finite with
probability one, and it is also integrable. Applying the classical law of
large numbers, we get

$$\lim_{k \to \infty} \frac{\tau_k}{k} = E_e(\tau_1) \quad \text{with probability 1.}$$

Whenever the process $(X_t)_{t \ge 0}$ starts from $e$, the random times $\tau^*_1$, $\tau_1$ satisfy
$\tau^*_1 \ge 1$, $\tau_1 \ge 2$, therefore the expected mean $E_e(\tau_1)$ is strictly positive and
we conclude that

$$\lim_{k \to \infty} \tau_k = +\infty \quad \text{with probability 1.}$$

We define next

$$\forall t \ge 0 \qquad K(t) = \max \{ k \ge 0 : \tau_k \le t \} .$$

From the previous discussion, we see that, with probability one, $K(t)$ is
finite for any $t \ge 0$. From the very definition of $K(t)$, we have

$$\forall t \ge 0 \qquad \tau_{K(t)} \le t < \tau_{K(t)+1} ,$$

and since $\tau_k$ goes to $\infty$ with $k$, then

$$\lim_{t \to \infty} K(t) = +\infty \quad \text{with probability 1.}$$

We rewrite the previous double inequality as

$$\frac{\tau_{K(t)}}{K(t)} \le \frac{t}{K(t)} < \frac{\tau_{K(t)+1}}{K(t)+1} \cdot \frac{K(t)+1}{K(t)} .$$

Sending $t$ to $\infty$, we conclude that

$$\lim_{t \to \infty} \frac{K(t)}{t} = \frac{1}{E_e(\tau_1)} \quad \text{with probability 1.}$$

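We suppose that the process $(X_t)_{t \ge 0}$ starts from $e$. Let $f$ be a map from
$\mathcal{E}$ to $\mathbb{R}$ which vanishes on $\mathcal{E} \setminus \mathcal{W}^*$. By the ergodic theorem A.2 we have

$$\lim_{t \to \infty} E_e(f(X_t)) = \lim_{t \to \infty} \frac{1}{t} \int_0^t f(X_s) \, ds .$$

We decompose the last integral as follows:

$$\int_0^t f(X_s) \, ds = \sum_{k=1}^{K(t)} \int_{\tau^*_k}^{\tau_k} f(X_s) \, ds + \int_{\tau^*_{K(t)+1} \wedge t}^{t} f(X_s) \, ds ,$$

where $\tau^*_{K(t)+1} \wedge t$ stands for $\min(\tau^*_{K(t)+1}, t)$. For $k \ge 1$, the integral

$$N_k = \int_{\tau^*_k}^{\tau_k} f(X_s) \, ds$$

is a deterministic function of the excursion $(X_t , \ \tau_{k-1} \le t \le \tau_k)$, hence
the random variables $(N_k , k \ge 1)$ are independent identically distributed.
With probability one, $K(t)$ goes to $\infty$ as $t$ goes to $\infty$, thus by the classical
law of large numbers, we have

$$\lim_{t \to \infty} \frac{1}{K(t)} \sum_{k=1}^{K(t)} N_k = E_e(N_1) \quad \text{with probability 1.}$$

Writing

$$\frac{1}{t} \int_0^t f(X_s) \, ds = \frac{K(t)}{t} \cdot \frac{1}{K(t)} \sum_{k=1}^{K(t)} N_k + \frac{1}{t} \int_{\tau^*_{K(t)+1} \wedge t}^{t} f(X_s) \, ds$$

and letting $t$ go to $\infty$, we conclude

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t f(X_s) \, ds = \frac{E_e(N_1)}{E_e(\tau_1)} \quad \text{with probability 1.}$$

This yields the desired formula. □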
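9.4 Bounds on $\nu$

We denote by $\mu_O$, $\mu^\ell_O$, $\mu^1_O$ the invariant probability measures of the
processes $(O_t)_{t \ge 0}$, $(O^\ell_t)_{t \ge 0}$, $(O^1_t)_{t \ge 0}$. From section 7, the probability $\nu$ is the
image of $\mu_O$ through the map

$$o \in \mathcal{P}^m_{\ell+1} \mapsto \frac{1}{m} o(0) \in [0, 1] .$$

Thus, for any function $f : [0, 1] \to \mathbb{R}$,

$$\int_{[0,1]} f \, d\nu = \int_{\mathcal{P}^m_{\ell+1}} f\Big(\frac{o(0)}{m}\Big) \, d\mu_O(o) = \lim_{t \to \infty} E\Big( f\Big(\frac{1}{m} O_t(0)\Big) \Big) .$$

We fix now a non-decreasing function $f : [0, 1] \to \mathbb{R}$ such that $f(0) = 0$.
Proposition 9.1 yields the inequalities

$$\forall t \ge 0 \qquad f\Big(\frac{1}{m} O^\ell_t(0)\Big) \le f\Big(\frac{1}{m} O_t(0)\Big) \le f\Big(\frac{1}{m} O^1_t(0)\Big) .$$

Taking the expectation and sending $t$ to $\infty$, we get

$$\int_{\mathcal{P}^m_{\ell+1}} f\Big(\frac{o(0)}{m}\Big) \, d\mu^\ell_O(o) \le \int_{[0,1]} f \, d\nu \le \int_{\mathcal{P}^m_{\ell+1}} f\Big(\frac{o(0)}{m}\Big) \, d\mu^1_O(o) .$$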

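We seek next estimates on the above integrals. The strategy is the same
for the lower and the upper integral. Thus we fix $\theta$ to be either 1 or $\ell$ and
we study the invariant probability measure $\mu^\theta_O$. For the process $(O^\theta_t)_{t \ge 0}$,
the states of $\mathcal{T}^\theta$ are transient, while the populations in $\mathcal{N} \cup (\mathcal{W}^* \setminus \mathcal{T}^\theta)$
form a recurrent class. We apply the renewal result of proposition 9.2 to
the process $(O^\theta_t)_{t \ge 0}$ restricted to $\mathcal{N} \cup (\mathcal{W}^* \setminus \mathcal{T}^\theta)$, the set $\mathcal{W}^* \setminus \mathcal{T}^\theta$, the
occupancy distribution $o^\theta_{\text{exit}}$ and the function $o \mapsto f(o(0)/m)$. Setting

$$\tau^* = \inf \{ t \ge 0 : O^\theta_t \in \mathcal{W}^* \} ,$$
$$\tau = \inf \{ t \ge \tau^* : O^\theta_t = o^\theta_{\text{exit}} \} ,$$

we have

$$\int_{\mathcal{P}^m_{\ell+1}} f\Big(\frac{o(0)}{m}\Big) \, d\mu^\theta_O(o) = \frac{ E\Big( \int_{\tau^*}^{\tau} f\big(\frac{O^\theta_s(0)}{m}\big) \, ds \,\Big|\, O^\theta_0 = o^\theta_{\text{exit}} \Big) }{ E(\tau \mid O^\theta_0 = o^\theta_{\text{exit}}) } .$$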
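Yet, whenever the process $(O^\theta_t)_{t \ge 0}$ is in $\mathcal{W}^* \setminus \mathcal{T}^\theta$, the dynamics of $(O^\theta_t(0))_{t \ge 0}$
is the same as the birth and death process $(Z^\theta_t)_{t \ge 0}$ defined at the end of
section 9.2. We suppose that $(Z^\theta_t)_{t \ge 0}$ starts from $Z^\theta_0 = 1$. Let $\tau_0$ be the
hitting time of 0, defined by

$$\tau_0 = \inf \{ n \ge 0 : Z^\theta_n = 0 \} .$$

The process $(O^\theta_t)_{t \ge 0}$ always enters $\mathcal{W}^*$ at $o^\theta_{\text{enter}}$ and it always exits $\mathcal{W}^* \setminus \mathcal{T}^\theta$
at $o^\theta_{\text{exit}}$. In particular $\tau$ coincides with the exit time of $\mathcal{W}^* \setminus \mathcal{T}^\theta$ after $\tau^*$.
From the previous elements, we see that $(O^\theta_t(0), \ \tau^* \le t \le \tau)$ has the same
law as $(Z^\theta_t, \ 0 \le t \le \tau_0)$, whence

$$E\Big( \int_{\tau^*}^{\tau} f\Big(\frac{O^\theta_s(0)}{m}\Big) \, ds \,\Big|\, O^\theta_0 = o^\theta_{\text{exit}} \Big) = E\Big( \int_0^{\tau_0} f\Big(\frac{Z^\theta_s}{m}\Big) \, ds \,\Big|\, Z^\theta_0 = 1 \Big) .$$

Moreover, using the Markov property, we have

$$E(\tau - \tau^* \mid O^\theta_0 = o^\theta_{\text{exit}}) = E(\tau \mid O^\theta_0 = o^\theta_{\text{enter}}) = E(\tau_0 \mid Z^\theta_0 = 1) .$$

Reporting back in the formula for the invariant probability measure $\mu^\theta_O$,
we get

$$\int_{\mathcal{P}^m_{\ell+1}} f\Big(\frac{o(0)}{m}\Big) \, d\mu^\theta_O(o) = \frac{ E\Big( \int_0^{\tau_0} f\big(\frac{Z^\theta_s}{m}\big) \, ds \,\Big|\, Z^\theta_0 = 1 \Big) }{ E(\tau^* \mid O^\theta_0 = o^\theta_{\text{exit}}) + E(\tau_0 \mid Z^\theta_0 = 1) } .$$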

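In order to reinterpret this formula, we apply the renewal result stated in
proposition 9.2 to the process $(Z^\theta_t)_{t \ge 0}$, the set $\{1, \dots, m\}$, the point 0 and
the map $f(\cdot/m)$. Setting

$$\tau_1 = \inf \{ t \ge 0 : Z^\theta_t = 1 \} ,$$

and denoting by $\nu^\theta$ the invariant probability measure of $(Z^\theta_t)_{t \ge 0}$, we have,
with the help of the Markov property,

$$\sum_{i=1}^{m} f\Big(\frac{i}{m}\Big) \nu^\theta(i) = \frac{ E\Big( \int_0^{\tau_0} f\big(\frac{Z^\theta_s}{m}\big) \, ds \,\Big|\, Z^\theta_0 = 1 \Big) }{ E(\tau_1 \mid Z^\theta_0 = 0) + E(\tau_0 \mid Z^\theta_0 = 1) } .$$

Yet

$$E(\tau_1 \mid Z^\theta_0 = 0) = \frac{1}{P(Z^\theta_1 = 1 \mid Z^\theta_0 = 0)} = \frac{1}{M_H(\theta, 0)} .$$

We conclude finally that

$$\int_{\mathcal{P}^m_{\ell+1}} f\Big(\frac{o(0)}{m}\Big) \, d\mu^\theta_O(o) = \frac{ \Big( \dfrac{1}{M_H(\theta,0)} + E(\tau_0 \mid Z^\theta_0 = 1) \Big) \displaystyle\sum_{i=1}^{m} f\Big(\frac{i}{m}\Big) \nu^\theta(i) }{ E(\tau^* \mid O^\theta_0 = o^\theta_{\text{exit}}) + E(\tau_0 \mid Z^\theta_0 = 1) } .$$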
To estimate the integral, we must estimate each term appearing on the
right-hand side. In section 10 we deal with the terms involving the birth
and death processes. In section 11 we deal with the discovery time $\tau^*$.
10 Birth and death processes



We first give explicit formulas for a birth and death Markov chain that are 
well adapted to our situation. The formula for the invariant probability 
measure can be found in classical books, for instance [21] . 



10.1 General formulas 

We consider a birth and death Markov chain $(Z_n)_{n \ge 0}$ on the finite set
$\{0, \dots, m\}$ with transition probabilities given by

$$P(Z_{n+1} = i + 1 \mid Z_n = i) = \delta_i , \qquad 0 \le i \le m - 1 ,$$
$$P(Z_{n+1} = i - 1 \mid Z_n = i) = \gamma_i , \qquad 1 \le i \le m ,$$

for any $n \ge 0$. We define

$$\pi(0) = 1 , \qquad \pi(i) = \frac{\delta_1 \cdots \delta_i}{\gamma_1 \cdots \gamma_i} , \quad 1 \le i \le m - 1 .$$

Let $\tau_0$ be the hitting time of 0, defined by

$$\tau_0 = \inf \{ n \ge 0 : Z_n = 0 \} .$$

We have the following explicit formula for the expected value of $\tau_0$:

$$E(\tau_0 \mid Z_0 = 1) = \sum_{i=1}^{m} \frac{1}{\gamma_i} \pi(i - 1) .$$

Let $\nu$ be the invariant probability measure of $(Z_n)_{n \ge 0}$. We have the
following explicit formula for $\nu$:

$$\nu(0) = \frac{1}{1 + \delta_0 E(\tau_0 \mid Z_0 = 1)} ,$$

$$\forall i \in \{1, \dots, m\} \qquad \nu(i) = \frac{ \dfrac{\delta_0}{\gamma_i} \pi(i - 1) }{ 1 + \delta_0 E(\tau_0 \mid Z_0 = 1) } .$$
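As a sanity check on the general formulas of this section, the following sketch evaluates the products, the expected hitting time and the invariant measure in exact rational arithmetic for an arbitrary birth and death chain. The function names and the list convention (`delta[m] = 0`, `gamma[0] = 0`) are assumptions made for the illustration only.

```python
from fractions import Fraction

def pi_products(delta, gamma):
    """pi(0) = 1 and pi(i) = delta_1...delta_i / (gamma_1...gamma_i), 1 <= i <= m-1."""
    m = len(gamma) - 1
    pi = [Fraction(1)]
    for i in range(1, m):
        pi.append(pi[-1] * delta[i] / gamma[i])
    return pi

def expected_hitting_time(delta, gamma):
    """E(tau_0 | Z_0 = 1) = sum over 1 <= i <= m of pi(i-1) / gamma_i."""
    m = len(gamma) - 1
    pi = pi_products(delta, gamma)
    return sum(pi[i - 1] / gamma[i] for i in range(1, m + 1))

def invariant_measure(delta, gamma):
    """nu(0) and nu(i) = (delta_0 / gamma_i) pi(i-1) / (1 + delta_0 E(tau_0 | Z_0 = 1))."""
    m = len(gamma) - 1
    pi = pi_products(delta, gamma)
    z = 1 + delta[0] * expected_hitting_time(delta, gamma)
    return [1 / z] + [delta[0] / gamma[i] * pi[i - 1] / z for i in range(1, m + 1)]

def is_invariant(nu, delta, gamma):
    """Check nu P = nu for the tridiagonal transition matrix P."""
    m = len(gamma) - 1
    for j in range(m + 1):
        mass = nu[j] * (1 - delta[j] - gamma[j])
        if j > 0:
            mass += nu[j - 1] * delta[j - 1]
        if j < m:
            mass += nu[j + 1] * gamma[j + 1]
        if mass != nu[j]:
            return False
    return True
```

For instance, with `delta = [1/3, 1/4, 1/5, 1/6, 0]` and `gamma = [0, 1/2, 1/3, 1/4, 1/5]` given as `Fraction` values, `invariant_measure` returns a probability vector that `is_invariant` accepts.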



10.2 The case of $(Z^\theta_t)_{t \ge 0}$

We will now apply these formulas to the birth and death chains $(Z^\theta_t)_{t \ge 0}$
introduced at the end of section 9.2. For these two processes, we have the
following explicit formulas for the transition probabilities:

$$\gamma_i = \frac{\sigma i^2 (1 - M_H(0,0)) + i(m - i)(1 - M_H(\theta,0))}{m(\sigma i + m - i)} , \qquad 1 \le i \le m ,$$

$$\delta_i = \frac{\sigma i (m - i) M_H(0,0) + (m - i)^2 M_H(\theta,0)}{m(\sigma i + m - i)} , \qquad 0 \le i \le m - 1 .$$
The transition probabilities $\delta_i$, $\gamma_i$ depend on the parameters $\sigma$, $\ell$, $m$, $q$ as well
as $\theta$. We seek estimates of the expected value of $\tau_0$ and of the asymptotic
behavior of $\nu$ in the regime where

$$m, \ell \to +\infty , \qquad q \to 0 , \qquad \ell q \to a .$$

For this reason, we choose the above specific forms of the formulas, which
are well suited for our purposes. Since the results are the same for $\theta = 1$
and $\theta = \ell$, we drop the superscript $\theta$ from the notation, and we write
simply $Z_n$, $\nu$ instead of $Z^\theta_n$, $\nu^\theta$. Our first goal is to estimate the products $\pi(i)$.
We start by studying the ratio $\delta_i / \gamma_i$. We have

$$\forall i \in \{1, \dots, m - 1\} \qquad \frac{\delta_i}{\gamma_i} = \phi\Big( M_H(0,0), M_H(\theta,0), \frac{i}{m} \Big) ,$$

where $\phi : \, ]0, 1] \times [0, 1[ \times ]0, 1[ \, \to \, ]0, +\infty[$ is the function defined by

$$\phi(\beta, \varepsilon, \rho) = \frac{ (1 - \rho)\big( \sigma \beta \rho + (1 - \rho) \varepsilon \big) }{ \rho \big( \sigma (1 - \beta) \rho + (1 - \rho)(1 - \varepsilon) \big) } .$$
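The identity between the ratio of the transition probabilities and the function $\phi$ is a purely algebraic fact, which the following sketch verifies exactly with rational numbers. Here `beta` stands for $M_H(0,0)$ and `eps` for $M_H(\theta,0)$; the function names are chosen for this illustration and are not from the text.

```python
from fractions import Fraction

def gamma(i, m, sigma, beta, eps):
    """Transition probability i -> i - 1 (loss of one master sequence)."""
    return (sigma * i * i * (1 - beta) + i * (m - i) * (1 - eps)) / (m * (sigma * i + m - i))

def delta(i, m, sigma, beta, eps):
    """Transition probability i -> i + 1 (gain of one master sequence)."""
    return (sigma * i * (m - i) * beta + (m - i) ** 2 * eps) / (m * (sigma * i + m - i))

def phi(beta, eps, rho, sigma):
    """The function phi(beta, eps, rho) of the text, for a given sigma."""
    num = (1 - rho) * (sigma * beta * rho + (1 - rho) * eps)
    den = rho * (sigma * (1 - beta) * rho + (1 - rho) * (1 - eps))
    return num / den

def ratio_matches_phi(m, sigma, beta, eps):
    """Check delta_i / gamma_i == phi(beta, eps, i/m) for 1 <= i <= m - 1."""
    return all(
        delta(i, m, sigma, beta, eps) / gamma(i, m, sigma, beta, eps)
        == phi(beta, eps, Fraction(i, m), sigma)
        for i in range(1, m)
    )
```

Calling `ratio_matches_phi(12, Fraction(3), Fraction(3, 5), Fraction(1, 100))` compares both sides for every admissible $i$; the same helpers also show that $\delta_0$ reduces to `eps`.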



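What matters for the behavior of the products $\pi(i)$ is whether the values of
$\phi$ are larger or smaller than 1. The equation $\phi(\beta, \varepsilon, \rho) = 1$ can be rewritten
as

$$(\sigma - 1) \rho^2 + (1 - \sigma\beta + \varepsilon) \rho - \varepsilon = 0 .$$

This equation admits one positive root, given by

$$\rho(\beta, \varepsilon) = \frac{1}{2(\sigma - 1)} \Big( \sigma\beta - 1 - \varepsilon + \sqrt{ (\sigma\beta - 1 - \varepsilon)^2 + 4 \varepsilon (\sigma - 1) } \Big) .$$

Therefore we have

$$\phi(\beta, \varepsilon, \rho) > 1 \quad \text{if } \rho < \rho(\beta, \varepsilon) ,$$
$$\phi(\beta, \varepsilon, \rho) < 1 \quad \text{if } \rho > \rho(\beta, \varepsilon) .$$

This readily implies that

$$1 \le i \le j \le \lfloor \rho(\beta, \varepsilon) m \rfloor \implies \pi(i) \le \pi(j) ,$$
$$\lfloor \rho(\beta, \varepsilon) m \rfloor < i \le j \le m \implies \pi(i) \ge \pi(j) ,$$
$$\pi(\lfloor \rho(\beta, \varepsilon) m \rfloor) \ge \pi(\lfloor \rho(\beta, \varepsilon) m \rfloor + 1) .$$

It follows that the product $\pi(i)$ is maximal for $i = \lfloor \rho(\beta, \varepsilon) m \rfloor$:

$$\max_{1 \le i \le m} \pi(i) = \pi(\lfloor \rho(\beta, \varepsilon) m \rfloor) .$$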
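The role of the root can be illustrated numerically. The sketch below, in floating point and with hypothetical function names, computes the positive root of the quadratic equation and locates the index where the product of the values of $\phi$ at the points $i/m$ is largest; the parameter values in the usage note are arbitrary.

```python
import math

def phi(beta, eps, rho, sigma):
    """The function phi(beta, eps, rho) of the text, for a given sigma."""
    num = (1 - rho) * (sigma * beta * rho + (1 - rho) * eps)
    den = rho * (sigma * (1 - beta) * rho + (1 - rho) * (1 - eps))
    return num / den

def rho_root(beta, eps, sigma):
    """Positive root of (sigma-1) rho^2 + (1 - sigma*beta + eps) rho - eps = 0."""
    b = sigma * beta - 1 - eps
    return (b + math.sqrt(b * b + 4 * eps * (sigma - 1))) / (2 * (sigma - 1))

def argmax_pi(m, beta, eps, sigma):
    """Index i in {0, ..., m-1} maximizing pi(i), computed through logarithms."""
    log_pi = [0.0]
    for i in range(1, m):
        log_pi.append(log_pi[-1] + math.log(phi(beta, eps, i / m, sigma)))
    return max(range(m), key=lambda i: log_pi[i])
```

With `sigma = 3`, `beta = 0.5`, `eps = 0.01` and `m = 50`, the index returned by `argmax_pi` can be compared with `math.floor(rho_root(0.5, 0.01, 3) * 50)`.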

KKm 
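We notice in addition that $\phi(\beta, \varepsilon, \rho)$ is continuous and non-decreasing with
respect to the first two variables $\beta, \varepsilon$. In the next two sections, we compute
the relevant asymptotic estimates on the birth and death process.
Lemma 7.1 yields

$$M_H(0,0) = \big(1 - p(1 - 1/\kappa)\big)^\ell , \qquad M_H(1,0) = \frac{p}{\kappa} \big(1 - p(1 - 1/\kappa)\big)^{\ell - 1} , \qquad M_H(\ell,0) = \Big(\frac{p}{\kappa}\Big)^\ell .$$

As in theorem 3.1, we suppose that

$$\ell \to +\infty , \qquad m \to +\infty , \qquad q \to 0 ,$$

in such a way that

$$\ell q \to a \in \, ]0, +\infty[ .$$

In this regime, we have

$$\lim_{\ell \to \infty, \, q \to 0, \, \ell q \to a} M_H(0,0) = \exp(-a) ,$$

$$\lim_{\ell \to \infty, \, q \to 0, \, \ell q \to a} M_H(\ell,0) = \lim_{\ell \to \infty, \, q \to 0, \, \ell q \to a} M_H(1,0) = 0 .$$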

10.3 Persistence time 

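In this section, we will estimate the expected hitting time $E(\tau_0 \mid Z_0 = 1)$.
This quantity approximates the persistence time of the master sequence $w^*$.
We estimate first the products $\pi(i)$.

Proposition 10.1 Let $a \in \, ]0, +\infty[$. For $\rho \in [0, 1]$, we have

$$\lim_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{1}{m} \ln \pi(\lfloor \rho m \rfloor) = \int_0^\rho \ln \phi(e^{-a}, 0, s) \, ds .$$

Proof. Let $\rho \in [0, 1]$. For $m \ge 1$, we have

$$\frac{1}{m} \ln \pi(\lfloor \rho m \rfloor) = \frac{1}{m} \sum_{i=1}^{\lfloor \rho m \rfloor} \ln \phi\Big( M_H(0,0), M_H(\theta,0), \frac{i}{m} \Big) .$$

Let $\varepsilon \in \, ]0, e^{-a}[$. For $\ell$, $m$ large enough and $q$ small enough, we have

$$| M_H(0,0) - e^{-a} | < \varepsilon , \qquad 0 \le M_H(\theta,0) < \varepsilon ,$$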
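therefore, using the monotonicity properties of $\phi$,

$$\frac{1}{m} \sum_{i=1}^{\lfloor \rho m \rfloor} \ln \phi\Big( e^{-a} - \varepsilon, 0, \frac{i}{m} \Big) \le \frac{1}{m} \ln \pi(\lfloor \rho m \rfloor) \le \frac{1}{m} \sum_{i=1}^{\lfloor \rho m \rfloor} \ln \phi\Big( e^{-a} + \varepsilon, \varepsilon, \frac{i}{m} \Big) .$$

These sums are Riemann sums. Letting $\ell$, $m$ go to $\infty$ and $q$ go to 0, we get

$$\liminf_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{1}{m} \ln \pi(\lfloor \rho m \rfloor) \ge \int_0^\rho \ln \phi(e^{-a} - \varepsilon, 0, s) \, ds ,$$

$$\limsup_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{1}{m} \ln \pi(\lfloor \rho m \rfloor) \le \int_0^\rho \ln \phi(e^{-a} + \varepsilon, \varepsilon, s) \, ds .$$

We send $\varepsilon$ to 0 to obtain the result stated in the proposition. □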
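The Riemann sum argument can be illustrated numerically in the limiting case $\beta = e^{-a}$, $\varepsilon = 0$, where $\phi(\beta, 0, \rho)$ simplifies to $\sigma\beta(1-\rho) / (1 + (\sigma(1-\beta) - 1)\rho)$ and the integral of its logarithm has an elementary antiderivative. The sketch below is only an illustration: the function names are hypothetical and the closed form assumes $\sigma(1-\beta) \ne 1$.

```python
import math

def log_phi(beta, rho, sigma):
    """ln phi(beta, 0, rho), after cancelling the common factor rho."""
    return math.log(sigma * beta * (1 - rho)) - math.log(1 + (sigma * (1 - beta) - 1) * rho)

def riemann_sum(beta, rho, sigma, m):
    """(1/m) * sum of ln phi(beta, 0, i/m) over 1 <= i <= floor(rho * m)."""
    return sum(log_phi(beta, i / m, sigma) for i in range(1, math.floor(rho * m) + 1)) / m

def exact_integral(beta, rho, sigma):
    """Integral of ln phi(beta, 0, s) over [0, rho], assuming sigma*(1-beta) != 1."""
    c = sigma * (1 - beta) - 1
    part_const = rho * math.log(sigma * beta)
    part_num = -(1 - rho) * math.log(1 - rho) - rho                # integral of ln(1 - s)
    part_den = ((1 + c * rho) * math.log(1 + c * rho) - c * rho) / c  # integral of ln(1 + c s)
    return part_const + part_num - part_den
```

For example, with `sigma = 4`, `beta = math.exp(-0.5)` and `rho = 0.5`, the value of `riemann_sum(beta, rho, sigma, m)` approaches `exact_integral(beta, rho, sigma)` as `m` grows.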
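We define

$$\rho^*(a) = \rho(e^{-a}, 0) = \begin{cases} \dfrac{\sigma e^{-a} - 1}{\sigma - 1} & \text{if } \sigma e^{-a} > 1 \\ 0 & \text{if } \sigma e^{-a} \le 1 \end{cases}$$

Since $\phi(e^{-a}, 0, s) > 1$ for $s < \rho^*(a)$ and $\phi(e^{-a}, 0, s) < 1$ for $s > \rho^*(a)$, then
the integral

$$\int_0^\rho \ln \phi(e^{-a}, 0, s) \, ds$$

is maximal for $\rho = \rho^*(a)$.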

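Corollary 10.2 Let $a \in \, ]0, +\infty[$. The expected hitting time of 0 starting
from 1 satisfies

$$\lim_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{1}{m} \ln E(\tau_0 \mid Z_0 = 1) = \int_0^{\rho^*(a)} \ln \phi(e^{-a}, 0, s) \, ds .$$

Proof. We have the explicit formula

$$E(\tau_0 \mid Z_0 = 1) = \sum_{i=1}^{m} \frac{1}{\gamma_i} \pi(i - 1)$$

and the following bounds on $\gamma_i$:

$$\forall i \in \{1, \dots, m\} \qquad \frac{1 - M_H(0,0)}{m^2} \le \gamma_i \le 1 .$$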
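Let $\varepsilon \in \, ]0, e^{-a}[$. For $\ell$, $m$ large enough and $q$ small enough, we have

$$| M_H(0,0) - e^{-a} | < \varepsilon , \qquad 0 \le M_H(\theta,0) < \varepsilon .$$

We first compute an upper bound:

$$E(\tau_0 \mid Z_0 = 1) \le \frac{m^3}{1 - M_H(0,0)} \, \pi\big( \lfloor \rho(M_H(0,0), M_H(\theta,0)) m \rfloor \big) .$$

Using the monotonicity properties of $\phi$, we get

$$\pi\big( \lfloor \rho(M_H(0,0), M_H(\theta,0)) m \rfloor \big) = \prod_{i=1}^{\lfloor \rho(M_H(0,0), M_H(\theta,0)) m \rfloor} \phi\Big( M_H(0,0), M_H(\theta,0), \frac{i}{m} \Big)$$
$$\le \prod_{i=1}^{\lfloor \rho(M_H(0,0), M_H(\theta,0)) m \rfloor} \phi\Big( e^{-a} + \varepsilon, \varepsilon, \frac{i}{m} \Big) \le \prod_{i=1}^{\lfloor \rho(e^{-a} + \varepsilon, \varepsilon) m \rfloor} \phi\Big( e^{-a} + \varepsilon, \varepsilon, \frac{i}{m} \Big) .$$

The last inequality holds because the product $\pi(i)$ corresponding to the
parameters $e^{-a} + \varepsilon$, $\varepsilon$ is maximal for $i = \lfloor \rho(e^{-a} + \varepsilon, \varepsilon) m \rfloor$. We obtain that

$$E(\tau_0 \mid Z_0 = 1) \le \frac{m^3}{1 - M_H(0,0)} \prod_{i=1}^{\lfloor \rho(e^{-a} + \varepsilon, \varepsilon) m \rfloor} \phi\Big( e^{-a} + \varepsilon, \varepsilon, \frac{i}{m} \Big) .$$

Taking logarithms, we recognize a Riemann sum, hence

$$\limsup_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{1}{m} \ln E(\tau_0 \mid Z_0 = 1) \le \int_0^{\rho(e^{-a} + \varepsilon, \varepsilon)} \ln \phi(e^{-a} + \varepsilon, \varepsilon, s) \, ds .$$

Conversely,

$$E(\tau_0 \mid Z_0 = 1) \ge \prod_{i=1}^{\lfloor \rho(e^{-a}, 0) m \rfloor} \phi\Big( M_H(0,0), M_H(\theta,0), \frac{i}{m} \Big) \ge \prod_{i=1}^{\lfloor \rho(e^{-a}, 0) m \rfloor} \phi\Big( e^{-a} - \varepsilon, 0, \frac{i}{m} \Big) .$$

Taking logarithms, we recognize a Riemann sum, hence

$$\liminf_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{1}{m} \ln E(\tau_0 \mid Z_0 = 1) \ge \int_0^{\rho(e^{-a}, 0)} \ln \phi(e^{-a} - \varepsilon, 0, s) \, ds .$$

We let $\varepsilon$ go to 0 in the upper bound and in the lower bound to obtain the
desired conclusion. □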



10.4 Invariant probability measure 

In this section, we estimate the invariant probability measure of the process
$(Z^\theta_t)_{t \ge 0}$, or rather the numerator of the last formula of section 9.4. As usual,
we drop the superscript $\theta$ from the notation when it is not necessary, and
we put it back when we need to emphasize the differences between the cases
$\theta = \ell$ and $\theta = 1$. We define, as before corollary 10.2,

$$\rho^*(a) = \rho(e^{-a}, 0) = \begin{cases} \dfrac{\sigma e^{-a} - 1}{\sigma - 1} & \text{if } \sigma e^{-a} > 1 \\ 0 & \text{if } \sigma e^{-a} \le 1 \end{cases}$$

Let $f : [0, 1] \to \mathbb{R}$ be a non-decreasing function such that $f(0) = 0$. We
have the formula

$$\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \nu(i) = \frac{ \delta_0 \displaystyle\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) }{ 1 + \delta_0 E(\tau_0 \mid Z_0 = 1) } .$$

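Moreover $\delta_0 = M_H(\theta, 0)$, thus the numerator of the last formula of
section 9.4 can be rewritten as

$$\Big( \frac{1}{M_H(\theta,0)} + E(\tau_0 \mid Z_0 = 1) \Big) \sum_{i=1}^{m} f\Big(\frac{i}{m}\Big) \nu(i) = \sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) .$$

Our goal is to estimate the asymptotic behavior of the right-hand quantity.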



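Proposition 10.3 Let $f : [0, 1] \to \mathbb{R}$ be a continuous non-decreasing function
such that $f(0) = 0$. Let $a \in \, ]0, +\infty[$. We have

$$\lim_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{ \displaystyle\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) }{ E(\tau_0 \mid Z_0 = 1) } = f(\rho^*(a)) .$$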
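Proof. Throughout the proof, we write simply $\rho^*$ instead of $\rho^*(a)$. Let
$\eta > 0$. For $\ell$, $m$ large enough and $q$ small enough, we have

$$| \rho^* - \rho(M_H(0,0), M_H(\theta,0)) | < \eta ,$$

whence

$$\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) = \sum_{\substack{1 \le i \le m \\ |i/m - \rho^*| < \eta}} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) + \sum_{\substack{1 \le i \le m \\ |i/m - \rho^*| \ge \eta}} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1)$$
$$\le \sum_{\substack{1 \le i \le m \\ |i/m - \rho^*| < \eta}} f(\rho^* + \eta) \frac{1}{\gamma_i} \pi(i - 1) + \sum_{\substack{1 \le i \le m \\ |i/m - \rho^*| \ge \eta}} f(1) \frac{1}{\gamma_i} \pi(i - 1)$$
$$\le f(\rho^* + \eta) \, E(\tau_0 \mid Z_0 = 1) + \frac{m^3 f(1)}{1 - M_H(0,0)} \Big( \pi(\lfloor (\rho^* - \eta) m \rfloor) + \pi(\lfloor (\rho^* + \eta) m \rfloor) \Big) .$$

To obtain the last inequality, we have used the monotonicity properties of
$\pi(i)$ and the bounds on $\gamma_i$ given at the beginning of the proof of
corollary 10.2. The properties of $\phi$ and the definition of $\rho^*$ imply that

$$\int_0^{\rho^*} \ln \phi(e^{-a}, 0, \rho) \, d\rho > \max\Big( \int_0^{\rho^* - \eta} \ln \phi(e^{-a}, 0, \rho) \, d\rho , \ \int_0^{\rho^* + \eta} \ln \phi(e^{-a}, 0, \rho) \, d\rho \Big) ,$$

so that, using proposition 10.1, for $m$ large enough,

$$\frac{m^3 f(1)}{1 - M_H(0,0)} \Big( \pi(\lfloor (\rho^* - \eta) m \rfloor) + \pi(\lfloor (\rho^* + \eta) m \rfloor) \Big) \le \eta \, E(\tau_0 \mid Z_0 = 1) .$$

Adding together the previous inequalities, we arrive at

$$\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) \le \big( f(\rho^* + \eta) + \eta \big) \, E(\tau_0 \mid Z_0 = 1) .$$

Passing to the limit, we obtain that

$$\limsup_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{ \displaystyle\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) }{ E(\tau_0 \mid Z_0 = 1) } \le f(\rho^* + \eta) + \eta .$$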
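We seek next a complementary lower bound. If $\sigma e^{-a} \le 1$, then $\rho^* = 0$,
and obviously

$$\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) \ge 0 .$$

If $\sigma e^{-a} > 1$, then $\rho^* > 0$ and

$$\int_0^{\rho^*} \ln \phi(e^{-a}, 0, \rho) \, d\rho > \int_0^{\rho^* - \eta} \ln \phi(e^{-a}, 0, \rho) \, d\rho .$$

By corollary 10.2, for $\ell$, $m$ large enough and $q$ small enough,

$$\sum_{\substack{1 \le i \le m \\ i/m - \rho^* < -\eta}} \frac{1}{\gamma_i} \pi(i - 1) \le \frac{m^3}{1 - M_H(0,0)} \, \pi(\lfloor (\rho^* - \eta) m \rfloor) \le \eta \, E(\tau_0 \mid Z_0 = 1) .$$

Combining these inequalities, we obtain

$$\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) \ge \sum_{\substack{1 \le i \le m \\ i/m - \rho^* \ge -\eta}} f(\rho^* - \eta) \frac{1}{\gamma_i} \pi(i - 1)$$
$$= f(\rho^* - \eta) \Big( \sum_{1 \le i \le m} \frac{1}{\gamma_i} \pi(i - 1) - \sum_{\substack{1 \le i \le m \\ i/m - \rho^* < -\eta}} \frac{1}{\gamma_i} \pi(i - 1) \Big)$$
$$\ge f(\rho^* - \eta) \, E(\tau_0 \mid Z_0 = 1) \, (1 - \eta) .$$

Passing to the limit, we obtain that

$$\liminf_{\substack{\ell, m \to \infty \\ q \to 0, \, \ell q \to a}} \frac{ \displaystyle\sum_{1 \le i \le m} f\Big(\frac{i}{m}\Big) \frac{1}{\gamma_i} \pi(i - 1) }{ E(\tau_0 \mid Z_0 = 1) } \ge f(\rho^* - \eta)(1 - \eta) .$$

We finally let $\eta$ go to 0 in the lower and the upper bounds to obtain the
claim of the proposition. □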
11 The neutral phase

We denote by $\mathcal{N}$ the set of the populations which do not contain the master
sequence $w^*$, i.e.,

$$\mathcal{N} = \big( \mathcal{A}^\ell \setminus \{ w^* \} \big)^m .$$

Since we deal with the sharp peak landscape, the transition mechanism of
the process restricted to the set $\mathcal{N}$ is neutral. We consider a Moran process
$(X_n)_{n \ge 0}$ starting from a population of $\mathcal{N}$. We wish to evaluate the first
time when a master sequence appears in the population:

$$\tau^* = \inf \{ n \ge 0 : X_n \notin \mathcal{N} \} .$$

We call the time $\tau^*$ the discovery time. Until the time $\tau^*$, the process evolves
in $\mathcal{N}$ and the dynamics of the Moran model in $\mathcal{N}$ does not depend on $\sigma$.
In particular, the law of the discovery time $\tau^*$ is the same for the Moran
model with $\sigma > 1$ and the neutral Moran model with $\sigma = 1$. Therefore, we
compute the estimates for the latter model.

Neutral hypothesis. Throughout this section, we suppose that $\sigma = 1$.

11.1 Ancestral lines

It is a classical fact that neutral evolutionary processes are much easier to 
analyze than evolutionary processes with selection. The main reason is that 
the mutation mechanism and the sampling mechanism can be decoupled. 
For instance, it is possible to compute explicitly the law of a chromosome 
in the population at time n. 

Let $\mu_0$ be an exchangeable probability distribution on $(\mathcal{A}^\ell)^m$. Let $(X_n)_{n \ge 0}$
be the normalized neutral Moran process with mutation matrix $M$ and
initial law $\mu_0$. Let $\nu_0$ be the component marginal of $\mu_0$:

$$\forall u \in \mathcal{A}^\ell \qquad \nu_0(u) = \mu_0\big( \{ x \in (\mathcal{A}^\ell)^m : x(1) = u \} \big) .$$

Let $(W_n)_{n \ge 0}$ be a Markov chain with state space $\mathcal{A}^\ell$, having for transition
matrix the mutation matrix $M$ and with initial law $\nu_0$. Let $(\varepsilon_n)_{n \ge 1}$ be a
sequence of i.i.d. Bernoulli random variables with parameter $1/m$:

$$\forall n \ge 1 \qquad P(\varepsilon_n = 0) = 1 - \frac{1}{m} , \qquad P(\varepsilon_n = 1) = \frac{1}{m}$$

and let us set

$$\forall n \ge 1 \qquad N(n) = \varepsilon_1 + \cdots + \varepsilon_n .$$

We suppose also that the sequence $(\varepsilon_n)_{n \ge 1}$ and the Markov chain $(W_n)_{n \ge 0}$
are independent.
Proposition 11.1 Let $i \in \{1, \dots, m\}$. For any $n \ge 0$, the law of the $i$-th
chromosome of $X_n$ is equal to the law of $W_{N(n)}$.

Proof. We start by computing the transition matrix of the process
$(W_{N(n)})_{n \ge 0}$. For $u \in \mathcal{A}^\ell$ and $n \ge 0$,

$$P(W_{N(n+1)} = u) = P(W_{N(n+1)} = u, \ \varepsilon_{n+1} = 0) + P(W_{N(n+1)} = u, \ \varepsilon_{n+1} = 1)$$
$$= \Big( 1 - \frac{1}{m} \Big) P(W_{N(n)} = u) + \frac{1}{m} P(W_{N(n)+1} = u) .$$

Moreover

$$P(W_{N(n)+1} = u) = \sum_{v \in \mathcal{A}^\ell} P(W_{N(n)+1} = u, \ W_{N(n)} = v)$$
$$= \sum_{v \in \mathcal{A}^\ell} P(W_{N(n)+1} = u \mid W_{N(n)} = v) \, P(W_{N(n)} = v) = \sum_{v \in \mathcal{A}^\ell} P(W_{N(n)} = v) \, M(v, u) .$$

Therefore the transition matrix of the process $(W_{N(n)})_{n \ge 0}$ is

$$\Big( 1 - \frac{1}{m} \Big) I + \frac{1}{m} M ,$$

where $I$ is the identity matrix. We do now the proof by induction over $n$.
The result holds for $n = 0$. Suppose that it has been proved until time $n$.
Let $i \in \{1, \dots, m\}$. We have, for any $u \in \mathcal{A}^\ell$,

$$P(X_{n+1}(i) = u) = \sum_{x \in (\mathcal{A}^\ell)^m} P(X_{n+1}(i) = u, \ X_n = x) = \sum_{x \in (\mathcal{A}^\ell)^m} P(X_{n+1}(i) = u \mid X_n = x) \, P(X_n = x) .$$

Yet we have

$$P(X_{n+1}(i) = u \mid X_n = x) = \Big( 1 - \frac{1}{m} \Big) 1_{x(i) = u} + \frac{1}{m^2} \sum_{1 \le j \le m} M(x(j), u) .$$

Thus

$$P(X_{n+1}(i) = u) = \sum_{x \in (\mathcal{A}^\ell)^m} \Big( 1 - \frac{1}{m} \Big) 1_{x(i) = u} P(X_n = x) + \sum_{x \in (\mathcal{A}^\ell)^m} \frac{1}{m^2} \sum_{1 \le j \le m} M(x(j), u) \, P(X_n = x)$$
$$= \Big( 1 - \frac{1}{m} \Big) P(X_n(i) = u) + \sum_{v \in \mathcal{A}^\ell} \frac{1}{m^2} \sum_{1 \le j \le m} M(v, u) \, P(X_n(j) = v) .$$

By the induction hypothesis,

$$\forall v \in \mathcal{A}^\ell \quad \forall j \in \{1, \dots, m\} \qquad P(X_n(j) = v) = P(W_{N(n)} = v) ,$$

whence

$$P(X_{n+1}(i) = u) = \Big( 1 - \frac{1}{m} \Big) P(W_{N(n)} = u) + \frac{1}{m} \sum_{v \in \mathcal{A}^\ell} P(W_{N(n)} = v) \, M(v, u) = P(W_{N(n+1)} = u) .$$

The result still holds at time $n + 1$. □
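Proposition 11.1 can be tested exactly on a toy example by propagating the full law of the population and comparing a marginal with the law of the lazy chain with transition matrix $(1 - 1/m) I + M/m$. The sketch below is an illustration under an assumed reading of the neutral dynamics: at each step a parent index and a replaced index are drawn uniformly and independently, and the offspring is a mutated copy of the parent. All function names are hypothetical.

```python
from fractions import Fraction

def moran_step(law, M, m):
    """Push a law on populations (tuples of length m) through one neutral Moran step."""
    new = {}
    weight = Fraction(1, m * m)          # uniform parent k and uniform replaced index j
    for x, px in law.items():
        for k in range(m):
            for j in range(m):
                for u, prob in enumerate(M[x[k]]):
                    if prob == 0:
                        continue
                    y = x[:j] + (u,) + x[j + 1:]
                    new[y] = new.get(y, Fraction(0)) + px * weight * prob
    return new

def marginal(law, i, size):
    """Law of the i-th chromosome (index i counted from 0)."""
    out = [Fraction(0)] * size
    for x, px in law.items():
        out[x[i]] += px
    return out

def lazy_step(nu, M, m):
    """One step of the chain W_{N(n)}: nu -> nu ((1 - 1/m) I + (1/m) M)."""
    size = len(M)
    return [
        (1 - Fraction(1, m)) * nu[u]
        + Fraction(1, m) * sum(nu[v] * M[v][u] for v in range(size))
        for u in range(size)
    ]
```

Starting from an exchangeable initial law, iterating `moran_step` and `lazy_step` the same number of times and comparing `marginal(law, i, size)` with the iterated vector gives an exact check for each index `i`.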

We perform next a similar computation to obtain the law of an ancestral
line. Let us first define an ancestral line. For $i \in \{1, \dots, m\}$ and $n \ge 1$,
we denote by $I(i, n, n - 1)$ the index of the ancestor at time $n - 1$ of the
$i$-th chromosome at time $n$. Let us make its value explicit. If $X_{n-1} = x$ and
$X_n = y$ with $y = x(j \leftarrow u)$, where the chromosome $u$ has been obtained
by replicating the $k$-th chromosome of $x$, then

$$I(i, n, n - 1) = \begin{cases} i & \text{if } i \ne j \\ k & \text{if } i = j \end{cases}$$

For $s \le n$, the index $I(i, n, s)$ of the ancestor at time $s$ of the $i$-th
chromosome at time $n$ is then defined recursively with the help of the following
formula:

$$I(i, n, s) = I\big( I(i, n, n - 1), n - 1, s \big) .$$

The ancestor at time $s$ of the $i$-th chromosome at time $n$ is the chromosome

$$\text{ancestor}(i, n, s) = X_s(I(i, n, s)) .$$

The ancestral line of the $i$-th chromosome at time $n$ is the sequence of its
ancestors until time 0,

$$(\text{ancestor}(i, n, s), \ 0 \le s \le n) = (X_s(I(i, n, s)), \ 0 \le s \le n) .$$






Proposition 11.2 Let $i \in \{1, \dots, m\}$. For any $n \ge 0$, the law of the
ancestral line $(\text{ancestor}(i, n, s), \ 0 \le s \le n)$ of the $i$-th chromosome of $X_n$
is equal to the law of $(W_{N(0)}, \dots, W_{N(n)})$.

Proof. We do the proof by induction over $n$. The result is true at rank
$n = 0$. Suppose it has been proved until time $n$. Let $i \in \{1, \dots, m\}$ and
let $u_0, \dots, u_{n+1} \in \mathcal{A}^\ell$. We compute

$$P(\text{ancestor}(i, n + 1, s) = u_s , \ 0 \le s \le n + 1)$$
$$= \sum_{x \in (\mathcal{A}^\ell)^m} \sum_{1 \le j \le m} P\Big( X_{n+1}(i) = u_{n+1}, \ I(i, n + 1, n) = j, \ X_n = x, \ \text{ancestor}(j, n, s) = u_s , \ 0 \le s \le n \Big)$$
$$= \sum_{x \in (\mathcal{A}^\ell)^m} \sum_{1 \le j \le m} P\Big( X_{n+1}(i) = u_{n+1}, \ I(i, n + 1, n) = j \,\Big|\, \text{ancestor}(j, n, s) = u_s , \ 0 \le s \le n, \ X_n = x \Big)$$
$$\qquad \times \ P\Big( \text{ancestor}(j, n, s) = u_s , \ 0 \le s \le n, \ X_n = x \Big) .$$

Since we deal with the neutral process, we have

$$P\Big( X_{n+1}(i) = u_{n+1}, \ I(i, n + 1, n) = j \,\Big|\, \text{ancestor}(j, n, s) = u_s , \ 0 \le s \le n, \ X_n = x \Big)$$
$$= P\Big( X_{n+1}(i) = u_{n+1}, \ I(i, n + 1, n) = j \,\Big|\, X_n = x \Big) = \Big( 1 - \frac{1}{m} \Big) 1_{i = j} \, 1_{x(i) = u_{n+1}} + \frac{1}{m^2} M(x(j), u_{n+1}) .$$

Reporting in the previous equality, we get

$$P(\text{ancestor}(i, n + 1, s) = u_s , \ 0 \le s \le n + 1)$$
$$= \sum_{x \in (\mathcal{A}^\ell)^m} \Big( 1 - \frac{1}{m} \Big) 1_{x(i) = u_{n+1}} P\Big( \text{ancestor}(i, n, s) = u_s , \ 0 \le s \le n, \ X_n = x \Big)$$
$$\quad + \sum_{x \in (\mathcal{A}^\ell)^m} \sum_{1 \le j \le m} \frac{1}{m^2} M(x(j), u_{n+1}) \, P\Big( \text{ancestor}(j, n, s) = u_s , \ 0 \le s \le n, \ X_n = x \Big)$$
$$= \Big( 1 - \frac{1}{m} \Big) 1_{u_n = u_{n+1}} P(\text{ancestor}(i, n, s) = u_s , \ 0 \le s \le n)$$
$$\quad + \sum_{1 \le j \le m} \frac{1}{m^2} M(u_n, u_{n+1}) \, P(\text{ancestor}(j, n, s) = u_s , \ 0 \le s \le n) .$$

By the induction hypothesis, we have, for any $j \in \{1, \dots, m\}$,

$$P(\text{ancestor}(j, n, s) = u_s , \ 0 \le s \le n) = P(W_{N(0)} = u_0, \dots, W_{N(n)} = u_n) .$$

Therefore

$$P(\text{ancestor}(i, n + 1, s) = u_s , \ 0 \le s \le n + 1)$$
$$= P(W_{N(n+1)} = u_{n+1} \mid W_{N(n)} = u_n) \, P(W_{N(0)} = u_0, \dots, W_{N(n)} = u_n)$$
$$= P(W_{N(0)} = u_0, \dots, W_{N(n+1)} = u_{n+1})$$

and the induction step is completed. □

11.2 Mutation dynamics 

Throughout the section, we consider a Markov chain $(Y_n)_{n \ge 0}$ with state
space $\{0, \dots, \ell\}$ and having for transition matrix the lumped mutation
matrix $M_H$. By lemma 7.1, for $b, c \in \{0, \dots, \ell\}$, the coefficient $M_H(b, c)$
of the matrix $M_H$ is equal to

$$ \sum_{\substack{0 \le k \le \ell - b \\ 0 \le h \le b \\ k - h = c - b}} \binom{\ell - b}{k} \binom{b}{h} \Big(p\Big(1 - \frac{1}{\kappa}\Big)\Big)^{k} \Big(1 - p\Big(1 - \frac{1}{\kappa}\Big)\Big)^{\ell - b - k} \Big(\frac{p}{\kappa}\Big)^{h} \Big(1 - \frac{p}{\kappa}\Big)^{b - h} . $$

Such a Markov chain can be realized on our common probability space. Its
construction requires only the family of random variables

$$ (U_{n,l},\ n \ge 1,\ 1 \le l \le \ell) $$

with uniform law on the interval [0, 1]. Let $b \in \{0, \dots, \ell\}$ be the starting
point of the chain. We set $Y_0 = b$ and we define inductively for $n \ge 1$

$$ Y_n = Y_{n-1} - \sum_{k=1}^{Y_{n-1}} 1_{U_{n,k} < p/\kappa} + \sum_{k=Y_{n-1}+1}^{\ell} 1_{U_{n,k} > 1 - p(1 - 1/\kappa)} = \mathcal{M}_H(Y_{n-1}, U_{n,1}, \dots, U_{n,\ell}) . $$

By lemma 8.1, the map $\mathcal{M}_H$ is non-decreasing with respect to its first
argument. Thus the above construction provides a monotone coupling of
the processes starting with different initial conditions and we conclude that
the Markov chain $(Y_n)_{n \ge 0}$ is monotone.
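
The coupling construction above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `mutation_step` and `coupled_paths`, and the numerical values of $\ell$, $p$, $\kappa$, are hypothetical choices made for the example and do not come from the text.

```python
import random

def mutation_step(y, u, p, kappa):
    """One step of the coupling map: y is the current distance to the master
    sequence and u is a list of ell uniform variables, one per locus."""
    ell = len(u)
    # Loci 1..y differ from the master sequence; each is repaired when u < p/kappa.
    lost = sum(1 for k in range(y) if u[k] < p / kappa)
    # Loci y+1..ell agree with the master sequence; each is damaged
    # when u > 1 - p(1 - 1/kappa).
    gained = sum(1 for k in range(y, ell) if u[k] > 1 - p * (1 - 1 / kappa))
    return y - lost + gained

def coupled_paths(ell, p, kappa, steps, rng):
    """Run the chain from every starting point, driven by the SAME uniforms."""
    paths = [[b] for b in range(ell + 1)]
    for _ in range(steps):
        u = [rng.random() for _ in range(ell)]
        for path in paths:
            path.append(mutation_step(path[-1], u, p, kappa))
    return paths

paths = coupled_paths(ell=12, p=0.2, kappa=4, steps=50, rng=random.Random(0))
# Under the coupling, the order of the starting points is preserved at all times.
monotone = all(paths[b][t] <= paths[b + 1][t]
               for b in range(12) for t in range(51))
print(monotone)
```

Because all trajectories share the same uniform variables, a trajectory started lower can never overtake one started higher, which is the monotonicity used repeatedly in the sequel.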

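Proposition 11.3 The matrix $M_H$ is reversible with respect to the binomial
law $\mathcal{B}(\ell, 1 - 1/\kappa)$ with parameters $\ell$ and $1 - 1/\kappa$. This binomial law
is the invariant probability measure of the Markov chain $(Y_n)_{n \ge 0}$.

Notation. We denote simply by $\mathcal{B}$ the binomial law $\mathcal{B}(\ell, 1 - 1/\kappa)$. Thus

$$ \forall b \in \{0, \dots, \ell\} \qquad \mathcal{B}(b) = \binom{\ell}{b} \Big(1 - \frac{1}{\kappa}\Big)^{b} \Big(\frac{1}{\kappa}\Big)^{\ell - b} . $$

Proof. We check that the matrix $M_H$ is reversible with respect to $\mathcal{B}$.
Let $b, c \in \{0, \dots, \ell\}$. We use the identity

$$ \binom{\ell}{b} \binom{\ell - b}{k} \binom{b}{h} = \frac{\ell!}{k!\, h!\, (\ell - b - k)!\, (b - h)!} $$

to write

$$ \mathcal{B}(b)\, M_H(b, c) = \binom{\ell}{b} \Big(1 - \frac{1}{\kappa}\Big)^{b} \Big(\frac{1}{\kappa}\Big)^{\ell - b} \times \sum_{\substack{0 \le k \le \ell - b \\ 0 \le h \le b \\ k - h = c - b}} \binom{\ell - b}{k} \binom{b}{h} \Big(p\Big(1 - \frac{1}{\kappa}\Big)\Big)^{k} \Big(1 - p\Big(1 - \frac{1}{\kappa}\Big)\Big)^{\ell - b - k} \Big(\frac{p}{\kappa}\Big)^{h} \Big(1 - \frac{p}{\kappa}\Big)^{b - h} $$

$$ = \sum_{\substack{0 \le k \le \ell - b \\ 0 \le h \le b \\ k - h = c - b}} \frac{\ell!\; p^{k + h}}{k!\, h!\, (\ell - b - k)!\, (b - h)!} \Big(1 - \frac{1}{\kappa}\Big)^{b + k} \Big(\frac{1}{\kappa}\Big)^{\ell - b + h} \Big(1 - p\Big(1 - \frac{1}{\kappa}\Big)\Big)^{\ell - b - k} \Big(1 - \frac{p}{\kappa}\Big)^{b - h} . $$

We eliminate the variable $h = k + b - c$ in this formula:

$$ \mathcal{B}(b)\, M_H(b, c) = \sum_{\substack{0 \le k \le \ell - b \\ c - b \le k \le c}} \frac{\ell!\; p^{2k + b - c}}{k!\, (k + b - c)!\, (\ell - b - k)!\, (c - k)!} \Big(1 - \frac{1}{\kappa}\Big)^{b + k} \Big(\frac{1}{\kappa}\Big)^{\ell - c + k} \Big(1 - p\Big(1 - \frac{1}{\kappa}\Big)\Big)^{\ell - b - k} \Big(1 - \frac{p}{\kappa}\Big)^{c - k} . $$

If we set now $h = k + b - c$ and we eliminate $k$, we get

$$ \mathcal{B}(b)\, M_H(b, c) = \sum_{\substack{0 \le h \le b \\ b - c \le h \le \ell - c}} \frac{\ell!\; p^{2h + c - b}}{(h + c - b)!\, h!\, (\ell - c - h)!\, (b - h)!} \Big(1 - \frac{1}{\kappa}\Big)^{c + h} \Big(\frac{1}{\kappa}\Big)^{\ell - b + h} \Big(1 - p\Big(1 - \frac{1}{\kappa}\Big)\Big)^{\ell - c - h} \Big(1 - \frac{p}{\kappa}\Big)^{b - h} $$

$$ = \mathcal{B}(c)\, M_H(c, b) . $$

We obtain the same expression as before, but with b and c exchanged.
Thus the matrix $M_H$ is reversible with respect to $\mathcal{B}$ and $\mathcal{B}$ is the invariant
probability measure of $M_H$. □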
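
The detailed balance identity can also be checked mechanically on a small instance. The following sketch uses exact rational arithmetic; the helper names `lumped_matrix` and `binomial_law` and the values $\ell = 6$, $\kappa = 3$, $p = 1/5$ are assumptions chosen for the illustration.

```python
from fractions import Fraction
from math import comb

def lumped_matrix(ell, kappa, p):
    """Lumped mutation matrix M_H on {0, ..., ell}, with exact fractions."""
    up = p * (1 - Fraction(1, kappa))   # a locus equal to the master becomes different
    down = p / kappa                    # a locus different from the master is repaired
    M = [[Fraction(0)] * (ell + 1) for _ in range(ell + 1)]
    for b in range(ell + 1):
        for k in range(ell - b + 1):    # k loci gained
            for h in range(b + 1):      # h loci lost
                M[b][b + k - h] += (comb(ell - b, k) * up**k * (1 - up)**(ell - b - k)
                                    * comb(b, h) * down**h * (1 - down)**(b - h))
    return M

def binomial_law(ell, kappa):
    """Binomial law with parameters ell and 1 - 1/kappa."""
    r = Fraction(1, kappa)
    return [comb(ell, b) * (1 - r)**b * r**(ell - b) for b in range(ell + 1)]

ell, kappa, p = 6, 3, Fraction(1, 5)
M, B = lumped_matrix(ell, kappa, p), binomial_law(ell, kappa)
stochastic = all(sum(row) == 1 for row in M)
reversible = all(B[b] * M[b][c] == B[c] * M[c][b]
                 for b in range(ell + 1) for c in range(ell + 1))
print(stochastic, reversible)
```

Since the arithmetic is exact, the equality test is a genuine verification of the identity for these parameters, not a floating point approximation.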

When $\ell$ grows, the law $\mathcal{B}$ concentrates exponentially fast in a neighborhood
of its mean

$$ \ell_\kappa = \sum_{l=0}^{\ell} l\, \mathcal{B}(l) = \ell (1 - 1/\kappa) . $$

We estimate next the probability of the points at the left of $\ell_\kappa$.
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i ( £ \ b „„, e b 



Lemma 11.4 For $b \le \ell/2$, we have

< Bib) < 

AU J 

Proof. Let $b \le \ell/2$. Then



(£\( ( A 1 (l-bf 1 / ^ \ & 1 

The upper bound on $\mathcal{B}(b)$ is straightforward. □

The estimates of lemma 11.4 can be considerably enhanced. In the next
lemma, we present the fundamental large deviation estimates for the
binomial distribution. This is the simplest case of the famous Cramér theorem.



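Lemma 11.5 For $\rho \in [0, 1]$, we have

$$ \lim_{\ell \to \infty} \frac{1}{\ell} \ln \mathcal{B}(\lfloor \rho \ell \rfloor) = -(1 - \rho) \ln\big(\kappa (1 - \rho)\big) - \rho \ln \frac{\kappa \rho}{\kappa - 1} . $$

Proof. We write

$$ \ln \mathcal{B}(\lfloor \rho \ell \rfloor) = \ln \frac{\ell (\ell - 1) \cdots (\ell - \lfloor \rho \ell \rfloor + 1)}{1 \cdot 2 \cdots \lfloor \rho \ell \rfloor} + (\ell - \lfloor \rho \ell \rfloor) \ln \frac{1}{\kappa} + \lfloor \rho \ell \rfloor \ln\Big(1 - \frac{1}{\kappa}\Big) $$

$$ = \sum_{k=0}^{\lfloor \rho \ell \rfloor - 1} \ln\Big(1 - \frac{k}{\ell}\Big) - \sum_{k=1}^{\lfloor \rho \ell \rfloor} \ln \frac{k}{\ell} + (\ell - \lfloor \rho \ell \rfloor) \ln \frac{1}{\kappa} + \lfloor \rho \ell \rfloor \ln\Big(1 - \frac{1}{\kappa}\Big) . $$

We recognize Riemann sums for the functions $\ln(1 - x)$ and $\ln x$, thus

$$ \lim_{\ell \to \infty} \frac{1}{\ell} \ln \mathcal{B}(\lfloor \rho \ell \rfloor) = \int_0^{\rho} \ln \frac{1 - x}{x}\, dx + (1 - \rho) \ln \frac{1}{\kappa} + \rho \ln\Big(1 - \frac{1}{\kappa}\Big) . $$

We conclude by performing the integration. □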
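
As a numerical sanity check of this limit, one may compare $\frac{1}{\ell} \ln \mathcal{B}(\lfloor \rho \ell \rfloor)$ with the limiting expression for increasing values of $\ell$. The sketch below is an illustration only; the function names and the values $\kappa = 4$, $\rho = 0.3$ are assumptions made for the example.

```python
from math import floor, lgamma, log

def log_binomial_mass(ell, b, kappa):
    """ln B(b) for the binomial law with parameters ell and 1 - 1/kappa."""
    return (lgamma(ell + 1) - lgamma(b + 1) - lgamma(ell - b + 1)
            + b * log(1 - 1 / kappa) + (ell - b) * log(1 / kappa))

def rate_limit(rho, kappa):
    """Limiting value of (1/ell) ln B(floor(rho * ell)), for 0 < rho < 1."""
    return (-(1 - rho) * log(kappa * (1 - rho))
            - rho * log(kappa * rho / (kappa - 1)))

kappa, rho = 4, 0.3
for ell in (10**2, 10**4, 10**6):
    finite = log_binomial_mass(ell, floor(rho * ell), kappa) / ell
    print(ell, finite, rate_limit(rho, kappa))
```

The limiting expression vanishes at $\rho = 1 - 1/\kappa$ and is negative elsewhere, in agreement with the concentration of the binomial law around its mean.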

The rate function appearing in lemma 11.5 is minimal at $\rho = 1 - 1/\kappa$, that
is, at the point $\ell_\kappa$. The typical behavior of the Markov chain $(Y_n)_{n \ge 0}$ is
the following. Starting from 1, it very quickly reaches a neighborhood of its
stable equilibrium $\ell_\kappa$. Then it starts exploring the surrounding space by
performing larger and larger excursions outside $\ell_\kappa$. Starting from $\ell_\kappa$, the
time needed to hit the point $c \in \{0, \dots, \ell\}$ is of order $\mathcal{B}(c)^{-1}$. Once the
process is close to $\ell_\kappa$, it is unlikely to visit 0 before time $\mathcal{B}(0)^{-1} = \kappa^\ell$.
This is why the expected value of the hitting time of 0 starting from 1 is of
order $\kappa^\ell$. In the next sections, we derive quantitative bounds on the
behavior of the chain $(Y_n)_{n \ge 0}$, starting from 1 or from $\ell$. We need only crude
bounds, hence we use elementary techniques, namely, we compare the process
with a sum of i.i.d. random variables and we use the classical Chebyshev
inequality, as well as the exponential Chebyshev inequality. The resulting
proofs are somewhat clumsy, and better estimates could certainly be derived
with more sophisticated tools.

11.3 Falling to equilibrium from the left 

For $b \in \{0, \dots, \ell\}$, we define the hitting time $\tau(b)$ of $\{b, \dots, \ell\}$ by

$$ \tau(b) = \inf \{\, n \ge 0 : Y_n \ge b \,\} . $$

Our first goal is to estimate, for b smaller than $\ell_\kappa = \ell(1 - 1/\kappa)$ and $n \ge 1$,
the probability

$$ P(\tau(b) > n \mid Y_0 = 0) . $$

Rough bound on the drift. Suppose that $\tau(b) > n$. Then $Y_{n-1} < b$ and

$$ Y_n \ge Y_{n-1} - \sum_{k=1}^{b} 1_{U_{n,k} < p/\kappa} + \sum_{k=b+1}^{\ell} 1_{U_{n,k} > 1 - p(1 - 1/\kappa)} . $$

Iterating this inequality, we see that, on the event $\{\tau(b) > n\}$, we have
$Y_n \ge V_n$ where

$$ V_n = \sum_{t=1}^{n} \Big( - \sum_{k=1}^{b} 1_{U_{t,k} < p/\kappa} + \sum_{k=b+1}^{\ell} 1_{U_{t,k} > 1 - p(1 - 1/\kappa)} \Big) . $$

Therefore

$$ P(\tau(b) > n \mid Y_0 = 0) \le P(V_n < b) . $$

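We shall bound $P(V_n < b)$ with the help of Chebyshev's inequality. Let us
compute the mean and the variance of $V_n$. Since $V_n$ is a signed sum of
independent Bernoulli random variables, we have

$$ E(V_n) = n \Big( -b\, \frac{p}{\kappa} + (\ell - b)\, p \Big(1 - \frac{1}{\kappa}\Big) \Big) = n p (\ell_\kappa - b) , $$

$$ \mathrm{Var}(V_n) = n \Big( b\, \frac{p}{\kappa} \Big(1 - \frac{p}{\kappa}\Big) + (\ell - b)\, p \Big(1 - \frac{1}{\kappa}\Big) \Big(1 - p \Big(1 - \frac{1}{\kappa}\Big)\Big) \Big) \le n \big(b p + (\ell - b) p\big) = n \ell p . $$

We suppose that n is large enough so that $2b < E(V_n)$, that is,

$$ n > \frac{2b}{p (\ell_\kappa - b)} . $$

By Chebyshev's inequality, we have then

$$ P(V_n < b) = P\big(V_n - E(V_n) < b - E(V_n)\big) \le P\Big(|V_n - E(V_n)| > \frac{1}{2} E(V_n)\Big) \le \frac{4\, \mathrm{Var}(V_n)}{E(V_n)^2} \le \frac{4 n \ell p}{n^2 p^2 (\ell_\kappa - b)^2} = \frac{4 \ell}{n p (\ell_\kappa - b)^2} . $$

We have thus proved the following estimate.

Lemma 11.6 For n such that

$$ n > \frac{2b}{p (\ell_\kappa - b)} , $$

we have

$$ P(\tau(b) > n \mid Y_0 = 0) \le \frac{4 \ell}{n p (\ell_\kappa - b)^2} . $$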
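
To get a feeling for how crude this Chebyshev estimate is, one can compute $P(\tau(b) > n \mid Y_0 = 0)$ exactly on a small instance by killing the chain as soon as it reaches $\{b, \dots, \ell\}$. The sketch below is an illustration under assumptions: the helper names and the parameters $\ell = 20$, $\kappa = 4$, $p = 0.1$, $b = 5$, $n = 40$ are hypothetical choices.

```python
from math import comb

def lumped_matrix(ell, kappa, p):
    """Lumped mutation matrix M_H on {0, ..., ell} (floating point)."""
    up, down = p * (1 - 1 / kappa), p / kappa
    M = [[0.0] * (ell + 1) for _ in range(ell + 1)]
    for b in range(ell + 1):
        for k in range(ell - b + 1):
            for h in range(b + 1):
                M[b][b + k - h] += (comb(ell - b, k) * up**k * (1 - up)**(ell - b - k)
                                    * comb(b, h) * down**h * (1 - down)**(b - h))
    return M

def prob_not_reached(M, b, n):
    """P(tau(b) > n | Y_0 = 0): mass that stayed strictly below b for n steps."""
    mass = [1.0] + [0.0] * (b - 1)
    for _ in range(n):
        mass = [sum(mass[x] * M[x][y] for x in range(b)) for y in range(b)]
    return sum(mass)

def chebyshev_bound(ell, kappa, p, b, n):
    """Right-hand side of the Chebyshev estimate, under its hypothesis on n."""
    ell_kappa = ell * (1 - 1 / kappa)
    assert n > 2 * b / (p * (ell_kappa - b)), "hypothesis on n is not satisfied"
    return 4 * ell / (n * p * (ell_kappa - b) ** 2)

ell, kappa, p, b, n = 20, 4, 0.1, 5, 40
exact = prob_not_reached(lumped_matrix(ell, kappa, p), b, n)
bound = chebyshev_bound(ell, kappa, p, b, n)
print(exact, bound)
```

For these values the bound equals 0.2, while the exact probability is much smaller; this gap is the reason why an exponential inequality is needed later on.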

We derive next a crude lower bound on the descent from 0 to $\ell_\kappa$. This
lower bound will be used to derive the upper bound on the discovery time.

Proposition 11.7 We suppose that $\ell \to +\infty$, $q \to 0$ in such a way that

$$ \ell q \to a \in\, ]0, +\infty[ . $$

For $\ell$ large enough and q small enough, we have

^S'llW)* (l-j^)®"""-. 
Proof. We decompose 

P(r(4) < I Y Q = 0) > P(t(£ k - Int) < f, r(4) < P I Y = 0) 



^ ^ P(r(4-ln^)=i, K t = 6, r(4) < ^ | ^ = 0) 

t<£ 2 6>^-ln^ 

xP(r(£ K -]ne) = t, Y t = b\Y =0) 
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By the Markov property and the monotonicity of the process (Yn)n>o 7 we 
have, for t < £ 2 and b > £ K — \n£, 

P{t(£ k ) < e 2 | t(£ k -ln£) = t, Y t = b, Y a = 0) 

> P(r(£ K ) <£ 2 ~t\Y = b) > P(r{£ K ) < £ 2 - t \ Y Q = £ K - ln£) 
> P(Y 1 =£ K \Y =£ K -ln£) = M H (£ K - ln£, £ K ) . 

Reporting this inequality in the previous sum, we get 

P(r(£ K ) <£ 2 \Y = 0) > P(r(4 - ln£) < £ 2 \ Y = 0) M H {i K - £ K ) . 

By lemma 111.61 applied with b = £ K — In £ and n = £ 2 — 1 , we have for I 
large enough and q small enough 



P(t(£ k ~\u£)>£ 2 \Y =0) 



< 



a(lnf) 2 ■ 

Moreover, for I large enough and q small enough, 

9 (1-9)" > (9 e- 2a . 

Putting the previous inequalities together, we obtain the desired lower 
bound. □ 

We will need more information in order to derive the lower bound on the
discovery time. We wish to control the time and speed at which the Markov
chain $(Y_n)_{n \ge 0}$, starting from 1, reaches a neighborhood of its equilibrium
$\ell_\kappa$ without visiting 0. This will require a stronger inequality than the one
stated in lemma 11.6; this is the purpose of the next lemma.

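Lemma 11.8 For $n \ge 1$, $b \in \{0, \dots, \ell\}$ and $\lambda > 0$, we have

$$ P(\tau(b) > n \mid Y_0 = 0) \le \exp\Big( \lambda b + n b\, \frac{p}{\kappa}\, (e^{\lambda} - 1) + n (\ell - b)\, p \Big(1 - \frac{1}{\kappa}\Big) (e^{-\lambda} - 1) \Big) . $$

Proof. We obtain this inequality as a consequence of Chebyshev's
exponential inequality. Indeed, we have

$$ P(\tau(b) > n \mid Y_0 = 0) \le P(V_n < b) = P(-\lambda V_n > -\lambda b) = P\big(\exp(-\lambda V_n) > \exp(-\lambda b)\big) $$

$$ \le \exp(\lambda b)\, E\big(\exp(-\lambda V_n)\big) = \exp(\lambda b)\, \Big(E\big(\exp(-\lambda V_1)\big)\Big)^{n} . $$

Yet

$$ E\big(\exp(-\lambda V_1)\big) = E\Big( \exp\Big( \lambda \sum_{k=1}^{b} 1_{U_{1,k} < p/\kappa} - \lambda \sum_{k=b+1}^{\ell} 1_{U_{1,k} > 1 - p(1 - 1/\kappa)} \Big) \Big) $$

$$ = \Big(1 + \frac{p}{\kappa} (e^{\lambda} - 1)\Big)^{b} \Big(1 + p \Big(1 - \frac{1}{\kappa}\Big) (e^{-\lambda} - 1)\Big)^{\ell - b} . $$

Thus

$$ P(\tau(b) > n \mid Y_0 = 0) \le \exp\Big( \lambda b + n b \ln\Big(1 + \frac{p}{\kappa} (e^{\lambda} - 1)\Big) + n (\ell - b) \ln\Big(1 + p \Big(1 - \frac{1}{\kappa}\Big) (e^{-\lambda} - 1)\Big) \Big) . $$

Using the inequality $\ln(1 + t) \le t$, we obtain the desired result. □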
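
The exponential bound depends on the free parameter $\lambda$, and in practice one picks a convenient value or optimizes over it. The following sketch evaluates the exponent on a grid of values of $\lambda$ and compares the resulting bound with the Chebyshev bound $4\ell / (n p (\ell_\kappa - b)^2)$; the function names, the grid, and the parameters are assumptions made for the illustration.

```python
from math import exp

def exponent(lam, n, b, ell, p, kappa):
    """Exponent of the exponential bound on P(tau(b) > n | Y_0 = 0)."""
    return (lam * b
            + n * b * (p / kappa) * (exp(lam) - 1)
            + n * (ell - b) * p * (1 - 1 / kappa) * (exp(-lam) - 1))

def best_exponential_bound(n, b, ell, p, kappa, grid=300, lam_max=3.0):
    """Minimize the bound over a regular grid of positive values of lambda."""
    lams = [lam_max * (i + 1) / grid for i in range(grid)]
    lam = min(lams, key=lambda x: exponent(x, n, b, ell, p, kappa))
    return lam, exp(exponent(lam, n, b, ell, p, kappa))

ell, kappa, p, b, n = 20, 4, 0.1, 5, 40
lam, bound = best_exponential_bound(n, b, ell, p, kappa)
chebyshev = 4 * ell / (n * p * (ell * (1 - 1 / kappa) - b) ** 2)
print(lam, bound, chebyshev)
```

On this instance the optimized exponential bound is smaller than the Chebyshev bound by several orders of magnitude, which illustrates why the exponential inequality is the right tool for the estimates of the fall.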

We derive next two kinds of estimates: first for the start of the fall, and
second for the completion of the fall.

Start of the fall. We show here that, after a time $\sqrt{\ell}$, the Markov chain
$(Y_n)_{n \ge 0}$ is with high probability in the interval $[\ln \ell, \ell]$.

Proposition 11.9 We suppose that $\ell \to +\infty$, $q \to 0$, $\ell q \to a \in\, ]0, +\infty[$.
For $\ell$ large enough and q small enough, we have

\ft>Vl P(Y t >ln£|y o = 0) > l-cxp(- i(ln^) 2 ) . 

Proof. We write, for t > y/I, 

P(Y t > \n£\Y = 0) > P(Y t > In 4 t(2\u£) < VI\Y = 0) 

= P(Y t >\nl,T(2\n£) = n,Y n = k\Y =0) 

n=l k=2lni 

= ]T P ( Yt - ln ^l T ( 21n ^) =n,Y n = k, F = 0)x 

71=1 k=2 In I 

P(r(2 ln£) = n, Y n = k \ Y Q = 0) . 

Now, for n < y/1 and k > 2hi£, by the Markov property, and thanks to 
the monotonicity of the process (Y t ) t >o, 

P(Y t >\n£\ r(2 In*) = n, Y n = k,Y = 0) 

= P(Y t > \n£\Y n = k) > P(Y t > \n£\Y n = 2ln£) 

= P(Y t _ n >ln£\Y = 2ln£). 
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For b < ln£, we have by lemmas [A . 1 1 and 1 1 1 . 4[ 

P(Y -b\Y -21ntf < m < B{ln£) < f ^MlX ni 



whence 

P(r 4 _„ > ln£| F = 21n£) > 1 - lnl ^ 

Reporting this inequality in the previous sum, we get 
P(Y t >\ne\Y = Q) > 

1 _ l n t ( ^Y^ ) n y{r(2ln£)<Vi\Y a =0). 

By lemma Hi .81 applied with A = In 2, n = y£, b = 2\a.£, for £ large enough 
and q small enough, 

P(r(21n£) > V~e\Y Q = 0) <cxp-^, 

whence 

P(r t >ln£|y o = 0) > fl-ln£f(l^V n£ Vl-exp ^ 



4 

> l-exp(-i(ln^) 2 

where the last inequality holds for £ large enough. □ 

Completion of the fall. We show here that, for $\varepsilon > 0$, after a time
$4\ell/(a\varepsilon)$, the Markov chain $(Y_n)_{n \ge 0}$ is with high probability in the interval
$[\ell_\kappa (1 - \varepsilon), \ell]$.

Proposition 11.10 We suppose that $\ell \to +\infty$, $q \to 0$, $\ell q \to a \in\, ]0, +\infty[$.
Let $\varepsilon \in\, ]0, 1[$. There exists $c(\varepsilon) > 0$ such that, for $\ell$ large enough and q
small enough, we have

$$ \forall t \ge \frac{4\ell}{a\varepsilon} \qquad P\big(Y_t \ge \ell_\kappa (1 - \varepsilon) \mid Y_0 = 0\big) \ge 1 - \exp(-c(\varepsilon) \ell) . $$

Proof. Let $t \ge 4\ell/(a\varepsilon)$. We write

P{Y t > 4(1 ~s)\Y = 0) 

> p(Y t > 4(1 - e), r(4(l - e/2)) < — \ Y = 



as 

U/{ae) 

= P(Y t >t K (l-s)\T{£ K (l-s/2)) = n,Y n = k,Y = 0) 

n=l k>e K (l-s/2) 

x P(r(4(l - e/2)) = n, Y n = k \ Y = 0) . 
Now, for n < 4£/(as) and k > £ K (1 — e/2), by the Markov property, 

P(Y t > 4(1 - e) | r(4(l - e/2)) = n, Y n = k, Y a = 0) 

= P(Y t > 4(1 - e) | Y n = k) = P(F 4 _„ > 4(1 - e) | Y = k) 

< P{Y t -n > 4(1 -e)\Y = 4(1 - e/2)) . 

We have used the monotonicity of the process (Y t )t>o with respect to the 
starting point to get the last inequality. For b < 4(1 — e), we have by 
lemmas I A. II and 111.41 



6(4(1 -e/2)) " 6(4(1 -e/2)) ' 
whence 

P{Y t _ n > 4d - e) | r = 4d - e/2)) > 1 - 4(1 - ^fj^ • 
Thanks to the large deviation estimates of lemma 111.51 we have 

thus there exists c(e) > such that, for £ large enough 

P{Y t - n > 4(1 -s)\Y = 4(1 - e/2)) > 1 - exp(-c( £ )f) . 
Reporting this inequality in the previous sum, we get 

P(Y t >l K (l-e)\Y o = 0) > 

(l - exp(-c(e)£))p(r(4(l - e/2)) < ^ | Y Q = o) . 
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We apply lemma HOI with b = i K (l - e/2), A > and n = U/(ae): for t 
large enough and q small enough, 

p(r(^(i-|))>||y = o)< 

exp (A4(l-|) + ^(4(l-|)^(e^-l) + (^4(l-|))^(e--l))). 

We send I to oo and q to in such a way that Iq converges to a > 0. We 
obtain 

1 / e M \ 

limsup -]nP(r(^(l --))>— \Y =o) < 

e-Hx t \ 2 ae / 

k - 2^ + s ( (1 - 2^ - + (^T + 2)^-^ - V) ■ 

Expanding the last term as A goes to 0, we see that it is negative for A 
small enough, therefore there exists c'(e) > such that for £ large enough 
and q small enough, 

P(r(£ K (l -7}))> — \Y = 6) < exp(-c'( £ )f) . 
V 2 ae / 

Reporting in the previous inequality on Y t , we obtain that 

P(Y t > 4(1 - e) I Yo = 0) > (l - exp(-c( £ )f)) (l - exp(-c'(e)£)) 
and this yields the desired result. □ 

11.4 Falling to equilibrium from the right 

For $b \in \{0, \dots, \ell\}$, we define the hitting time $\theta(b)$ of $\{0, \dots, b\}$ by

$$ \theta(b) = \inf \{\, n \ge 0 : Y_n \le b \,\} . $$

Proposition 11.11 We suppose that $\ell \to +\infty$, $q \to 0$ in such a way that

$$ \ell q \to a \in\, ]0, +\infty[ . $$

For $\ell$ large enough and q small enough, we have

P( 9( 4)<m. = *)>(i-;^)©' n V- 
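Proof. Our first goal is to estimate, for b larger than $\ell_\kappa = \ell(1 - 1/\kappa)$
and $n \ge 1$, the probability

$$ P(\theta(b) > n \mid Y_0 = \ell) . $$

Suppose that $\theta(b) > n$. Then $Y_{n-1} > b$ and

$$ Y_n \le Y_{n-1} - \sum_{k=1}^{b} 1_{U_{n,k} < p/\kappa} + \sum_{k=b+1}^{\ell} 1_{U_{n,k} > 1 - p(1 - 1/\kappa)} . $$

Iterating this inequality, we see that, if $Y_0 = \ell$, on the event $\{\theta(b) > n\}$,
we have $Y_n \le \ell + V_n$, where

$$ V_n = \sum_{t=1}^{n} \Big( - \sum_{k=1}^{b} 1_{U_{t,k} < p/\kappa} + \sum_{k=b+1}^{\ell} 1_{U_{t,k} > 1 - p(1 - 1/\kappa)} \Big) . $$

Therefore

$$ P(\theta(b) > n \mid Y_0 = \ell) \le P(\ell + V_n > b) . $$

We shall bound $P(\ell + V_n > b)$ with the help of Chebyshev's inequality.
Let us compute the mean and the variance of $V_n$. Since $V_n$ is a signed sum of
independent Bernoulli random variables, we have

$$ E(V_n) = n \Big( -b\, \frac{p}{\kappa} + (\ell - b)\, p \Big(1 - \frac{1}{\kappa}\Big) \Big) = n p (\ell_\kappa - b) , $$

$$ \mathrm{Var}(V_n) = n \Big( b\, \frac{p}{\kappa} \Big(1 - \frac{p}{\kappa}\Big) + (\ell - b)\, p \Big(1 - \frac{1}{\kappa}\Big) \Big(1 - p \Big(1 - \frac{1}{\kappa}\Big)\Big) \Big) \le n \big(b p + (\ell - b) p\big) = n \ell p . $$

We suppose that $b - \ell > n p (\ell_\kappa - b)$. By Chebyshev's inequality, we have
then

$$ P(\ell + V_n > b) = P\big(V_n - E(V_n) > b - \ell - n p (\ell_\kappa - b)\big) \le \frac{\mathrm{Var}(V_n)}{\big(b - \ell - n p (\ell_\kappa - b)\big)^2} . $$

We take $n = \ell^2$ and $b = \ln \ell + \ell_\kappa$. Then, for $\ell$ large enough,

$$ b - \ell - n p (\ell_\kappa - b) = \ln \ell + \ell_\kappa - \ell + \ell^2 p \ln \ell \,\sim\, \ell^2 p \ln \ell > 0 , $$

whence, by the previous inequalities, for $\ell$ large enough and q small enough,

$$ P\big(\theta(\ln \ell + \ell_\kappa) > \ell^2 \mid Y_0 = \ell\big) \le \frac{1}{a (\ln \ell)^2} . $$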
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We decompose next 

P{0(t K ) <£ 2 \Y = £) > P(6(£ K + Inl) < £ 2 , 9(£ K ) <£ 2 \Y = £) 

= J2 E P(o(e K + \ne) = t, Y t = b, 6(£ K ) < £ 2 \Y = £) 
t<p b<e K +\ne 

= E E p (^-) ^ e i + ln£ ) = *> r * = 5 ' r ° = l ) 

t<P fc<C+ln£ 

xP(^(4 + ln£)=t, y t = 6|lo = £). 

By the Markov property and the monotonicity of the process (Y n ) n >o, we 
have, for t < £ 2 and b < £ K + ln£, 

P(0{£ K ) < £ 2 I 0{£ K + ln£) =t,Y t = b, Y = £) 

= P(0(£ K ) <£ 2 -t\Y = b) > P(6(£ K ) < £ 2 - 1 1 Y = £ K + ln£) 
> P{Y X =4|*o = 4 + hi = M H (£ K + ln£, £ K ) . 

Reporting this inequality in the previous sum, we get 

P(6{£ K ) <£ 2 \Y =£) > P(9(£ K + ln£) < £ 2 | Y = £) M H {£ K + In £,£ K ) . 

We have already proved that 

P(e(ln£ + £ K )>f\Y = £) < — Lj. 

Moreover, for £ large enough and q small enough, 

M H (£ K + \n£,£ K ) > (l^ n \l- q y > Q' n V 2a . 

Putting the previous inequalities together, we obtain the desired lower 
bound. □ 

We derive next a large deviation upper bound for the time needed to go
from $\ell$ to 0. This will yield an upper bound on the discovery time. We
define

$$ \tau_0 = \inf \{\, n \ge 0 : Y_n = 0 \,\} . $$

Proposition 11.12 For any $a \in\, ]0, +\infty[$,

$$ \limsup_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \frac{1}{\ell} \ln E(\tau_0 \mid Y_0 = \ell) \le \ln \kappa . $$

Proof. We prove that, starting from $\ell$, the walker has probability of order
$1/\kappa^\ell$ to visit 0 before time $\ell^2$. To do this, we decompose the trajectory
until time $\ell^2$ into two parts: the descent to the equilibrium $\ell_\kappa$, which is
very likely to occur, and the ascent to 0, which is very unlikely to occur.
We estimate the probability of the ascent with the help of a beautiful
technique developed by Schonmann [33] in a different context, namely the
study of the metastability of the Ising model. More precisely, we use the
reversibility of the process to relate the probability of an ascending path to
the probability of a descending path. It turns out that the most likely way
to go from $\ell_\kappa$ to 0 is obtained as the time-reverse of a typical path going
from 0 to $\ell_\kappa$. Thanks to the monotonicity of the process, this estimate
yields a lower bound on the hitting time of 0 which is uniform with respect
to the starting point. We then easily bound $E(\tau_0 \mid Y_0 = \ell)$ by summing
over intervals of length $\ell^2$ and using the Markov property.

We should normally work with $\lfloor \ell_\kappa \rfloor$ instead of $\ell_\kappa$. To alleviate the
notation, we do as if $\ell_\kappa$ was an integer. We write

P(r < 2£ 2 \Y = £) > P(9(£ K ) < £ 2 , r < 2t 2 \Y Q =i) 

= EE P ( 9 ^ =t ^= b > T o < 2f \Y = l) 

t<e 2 b<i K 

= £ p ( r ° ^ 2£2 1 d ^ = t > Y t = b > Y o = i) 

t<l 2 b<l K 

x P(9(£ K ) = t 1 Y t = b\Y = £) . 

By the Markov property and the monotonicity of the process (Yn)n>o 7 we 
have, for t < £ 2 and b < £ K , 

P(t < 2£ 2 | 9(£ K ) =t,Y t = b, Y a = £) 

> P(3ne {<,..., 2£ 2 } Y n = 0\6(£ K ) = t,Y t = b,Y = 0) 

= P(t < 2£ 2 - 1 1 Y = b) > P(t <£ 2 \Y = £ k ) . 

Reporting this inequality in the previous sum, we get 

P(r < 2£ 2 \Y =£) > P(9(£ K ) < £ 2 \ Y = £) P(r < £ 2 \Y = £ K ) . 

We estimate next the probability of the ascending part, i.e., the last proba- 
bility in the above formula. We start with the estimate of proposition [TTTTJ 

P«U<e\r.-o)>(i-°)(l)".-*-. 
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Yet 

P(t(£ k ) < e | Y = 0) = P(3t <£ 2 Y t >£ K \Y = 0) 

< ^P(3i<£ 2 Y t = b\Y = 0) 

b>e K 

From the last inequalities, we see that there exists b > £ K such that 



lnf 

e~ 2a 



Using the reversibility of Mr with respect to B (see proposition 1 11.3|) . we 
have 

6(6)P(T <£ 2 |r = fe) 

= E E B(b)M H (b, Vl ) ... Afe(y t _i,0) 

t<£2 y 1 ,...,y t _ 1 >0 

= E E B(0)M H (0,y t _i) ...M H { yi ,b) 
t<e 2 vi,---,vt-i>a 

= B(0)P(3t<f r t = 6|y =o). 

Thus 

P(ro<£ 2 \Y = b) = M-P(3t<e Y t =b\Y Q = 0) 



£ \ a(ln£) 2 J\K, 
By monotonicity of the process (Y t )t>Q, since b > then 
P(t <£ 2 \Y = £ k ) > P{t <£ 2 \Y = b). 
Using proposition 1 1 1 . lTI and the previous inequalities, we conclude that 

P( TO <^|y = < )>^((i-^)(|rV-) ! . 

Let e > 0. For £ large enough and q small enough, 

1 



P(r < 2£ 2 | y = £) > 
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Now, for n > 0, 

P(r > 2nf \Y = i) 

= 2 P ( r ° > 2nf2 ' y 2(n-i)^ - b, r > 2(n - IK 2 | F = I) 

b>l 

= S P ( To > 1 y 2(n-i)^ - 6, to > 2(n - 1)£ 2 , Y = £) 

b>l 

x P(Y 2(n _ 1)P = 6, r > 2(n - l)^ 2 | Y = £) . 
By the Markov property and the monotonicity of the process, we have 

P(t > 2r^ 2 | r 2(n _ 1)£ 2 = b, T > 2(n - l)^ 2 , F = 

= P(t > 2n£ 2 | Y 2(n _ 1)P = b) = P(t > 2^ 2 \Y Q = b) 

< P(r a >2f\Y =t) < 1--^. 
Reporting in the previous sum, we get 

P(t > 2r^ 2 \Y = i) < (l - ■^ T ^)P(r > 2(n - l)^ 2 | Y = l) . 
Iterating, we obtain 

P(To>2^ 2 |r =^) < (l-^TTi)) • 

Thus 

E(r I Y = I) = P(ro > n | Y = I) 

n>l 
2{n+l)l 2 

<J2 S p ( T o>t\ Y o = e) <J2 2f2p ( T o > 2n£2 \ Y o = t) 

n>0t=2ni 2 + l n>0 



n>0 



This bound is true for any e > 0. Sending successively t to oo and e to 0, 
we obtain the desired upper bound. □ 



11.5 Discovery time 

The dynamics of the processes $(O^\ell_t)_{t \ge 0}$, $(O^1_t)_{t \ge 0}$ in $\mathcal{N}$ are the same as
the original process $(O_t)_{t \ge 0}$, therefore we can use the original process to
compute their corresponding discovery times. Letting

$$ \tau^{*,\ell} = \inf \{\, t \ge 0 : O^\ell_t \in \mathcal{W}^* \,\} , \qquad \tau^{*,1} = \inf \{\, t \ge 0 : O^1_t \in \mathcal{W}^* \,\} , \qquad \tau^* = \inf \{\, t \ge 0 : O_t \in \mathcal{W}^* \,\} , $$

we have indeed

$$ E\big(\tau^{*,\ell} \mid O^\ell_0 = o^\ell_{\text{exit}}\big) = E\big(\tau^* \mid O_0 = (0, 0, 0, \dots, m)\big) , $$

$$ E\big(\tau^{*,1} \mid O^1_0 = o^1_{\text{exit}}\big) = E\big(\tau^* \mid O_0 = (0, m, 0, \dots, 0)\big) . $$

In addition, the law of the discovery time $\tau^*$ is the same for the distance
process and the occupancy process. With a slight abuse of notation, we let

$$ \tau^* = \inf \{\, t \ge 0 : D_t \in \mathcal{W}^* \,\} . $$

Notation. For $b \in \{0, \dots, \ell\}$, we denote by $(b)^m$ the column vector whose
components are all equal to b:

$$ (b)^m = (b, \dots, b)^{T} . $$

We have

$$ E\big(\tau^* \mid O_0 = (0, 0, \dots, 0, m)\big) = E\big(\tau^* \mid D_0 = (\ell)^m\big) , $$

$$ E\big(\tau^* \mid O_0 = (0, m, 0, \dots, 0)\big) = E\big(\tau^* \mid D_0 = (1)^m\big) . $$

We will carry out the estimates of $\tau^*$ for the distance process $(D_n)_{n \ge 0}$.
Notice that the case $\alpha = +\infty$ is not covered by the result of the next
proposition. This case will be handled separately, with the help of the
intermediate inequality of corollary 11.14.

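Proposition 11.13 Let $a \in\, ]0, +\infty[$ and $\alpha \in [0, +\infty[$. For any $d \in \mathcal{N}$,

$$ \lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \frac{1}{\ell} \ln E(\tau^* \mid D_0 = d) = \ln \kappa . $$

Proof. Since we are in the neutral case $\sigma = 1$, then, by corollary 8.6, the
distance process $(D_n)_{n \ge 0}$ is monotone. Therefore, for any $d \in \mathcal{N}$, we have

$$ E\big(\tau^* \mid D_0 = (1)^m\big) \le E(\tau^* \mid D_0 = d) \le E\big(\tau^* \mid D_0 = (\ell)^m\big) . $$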

As in section 11.2, we consider a Markov chain $(Y_n)_{n \ge 0}$ with state
space $\{0, \dots, \ell\}$ and having for transition matrix the lumped mutation
matrix $M_H$. We consider also a sequence $(\varepsilon_n)_{n \ge 1}$ of i.i.d. Bernoulli random
variables with parameter 1/m and we set

$$ \forall n \ge 1 \qquad N(n) = \varepsilon_1 + \cdots + \varepsilon_n . $$

We suppose also that the processes $(N(n))_{n \ge 1}$ and $(Y_n)_{n \ge 0}$ are independent.
Let us look at the distance process at time n starting from $(\ell)^m$.
From proposition 11.1, we know that the law of the i-th chromosome in
$D_n$ is the same as the law of $Y_{N(n)}$ starting from $\ell$. The main difficulty
is that, because of the replication events, the m chromosomes present at
time n are not independent, nor are their genealogical lines. However,
this dependence does not improve significantly the efficiency of the search
mechanism, as long as the population is in the neutral space $\mathcal{N}$. To bound
the discovery time $\tau^*$ from above, we consider the time needed for a single
chromosome to discover the master sequence $w^*$, that is

$$ \tilde\tau_0 = \inf \{\, n \ge 0 : Y_{N(n)} = 0 \,\} $$

and we observe that, if the master sequence has not been discovered until
time n in the distance process, that is,

$$ \forall t \le n \quad \forall i \in \{1, \dots, m\} \qquad D_t(i) \ge 1 , $$

then certainly the ancestral line of any chromosome present at time n does
not contain the master sequence. By proposition 11.2, the ancestral line of
any chromosome present at time n has the same law as

$$ Y_{N(0)}, Y_{N(1)}, \dots, Y_{N(n)} . $$

From the previous observations, we conclude that

$$ \forall n \ge 0 \qquad P\big(\tau^* > n \mid D_0 = (\ell)^m\big) \le P(\tilde\tau_0 > n \mid Y_0 = \ell) . $$

Summing this inequality over $n \ge 0$, we have

$$ E\big(\tau^* \mid D_0 = (\ell)^m\big) \le E(\tilde\tau_0 \mid Y_0 = \ell) . $$

For $n \ge 0$, let

$$ T_n = \inf \{\, t \ge 0 : N(t) = n \,\} . $$

The variables $T_n - T_{n-1}$, $n \ge 1$, have the same law, therefore

$$ \forall n \ge 0 \qquad E(T_n) = n E(T_1) = n m . $$

We will next express the upper bound on $\tau^*$ as a function of

$$ \tau_0 = \inf \{\, n \ge 0 : Y_n = 0 \,\} . $$

We compute

$$ E(\tilde\tau_0 \mid Y_0 = \ell) = \sum_{t \ge 1} P(\tilde\tau_0 \ge t \mid Y_0 = \ell) = \sum_{t \ge 1} \sum_{n \ge 1} P(T_n \ge t,\ \tau_0 = n \mid Y_0 = \ell) $$

$$ = \sum_{n \ge 1} \sum_{t \ge 1} P(T_n \ge t)\, P(\tau_0 = n \mid Y_0 = \ell) = \sum_{n \ge 1} E(T_n)\, P(\tau_0 = n \mid Y_0 = \ell) $$

$$ = \sum_{n \ge 1} n m\, P(\tau_0 = n \mid Y_0 = \ell) = m\, E(\tau_0 \mid Y_0 = \ell) . $$

With the help of proposition 11.12, we conclude that

$$ \limsup_{\ell, m \to \infty,\ q \to 0} \frac{1}{\ell} \ln E(\tau^* \mid D_0 = d) \le \ln \kappa . $$

In fact, we have derived the following upper bound on the discovery time.

Corollary 11.14 Let $\tau_0$ be the hitting time of 0 for the process $(Y_n)_{n \ge 0}$.
For any $d \in \mathcal{N}$, any $m \ge 1$, we have

$$ E(\tau^* \mid D_0 = d) \le m\, E(\tau_0 \mid Y_0 = \ell) . $$
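
The identity behind this corollary, namely that the slowed-down chain $(Y_{N(n)})_{n \ge 0}$ needs on average exactly m times longer to hit 0 than $(Y_n)_{n \ge 0}$, can be checked exactly on a small instance by solving the linear systems for the expected hitting times. The sketch below is an illustration under assumptions: the helper names, the small Gaussian elimination routine, and the parameters $\ell = 6$, $\kappa = 2$, $p = 0.2$, $m = 5$ are hypothetical choices.

```python
from math import comb

def lumped_matrix(ell, kappa, p):
    """Lumped mutation matrix M_H on {0, ..., ell} (floating point)."""
    up, down = p * (1 - 1 / kappa), p / kappa
    M = [[0.0] * (ell + 1) for _ in range(ell + 1)]
    for b in range(ell + 1):
        for k in range(ell - b + 1):
            for h in range(b + 1):
                M[b][b + k - h] += (comb(ell - b, k) * up**k * (1 - up)**(ell - b - k)
                                    * comb(b, h) * down**h * (1 - down)**(b - h))
    return M

def solve(A, rhs):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(rhs)
    A = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (A[r][n] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x

def expected_hitting_time_of_zero(K):
    """Solve (I - Q) h = 1 on the states 1..ell, Q being K without state 0."""
    ell = len(K) - 1
    A = [[(1.0 if x == y else 0.0) - K[x][y] for y in range(1, ell + 1)]
         for x in range(1, ell + 1)]
    return solve(A, [1.0] * ell)   # entry x-1 is the expectation starting from x

ell, kappa, p, m = 6, 2, 0.2, 5
M = lumped_matrix(ell, kappa, p)
# Transition matrix of (Y_{N(n)}): move with probability 1/m, stay otherwise.
lazy = [[(1 - 1 / m) * (x == y) + M[x][y] / m for y in range(ell + 1)]
        for x in range(ell + 1)]
h = expected_hitting_time_of_zero(M)[-1]          # start from ell
h_lazy = expected_hitting_time_of_zero(lazy)[-1]  # start from ell
print(h, h_lazy, h_lazy / h)
```

The ratio of the two expectations equals m up to rounding errors, in agreement with the computation $E(\tilde\tau_0 \mid Y_0 = \ell) = m\, E(\tau_0 \mid Y_0 = \ell)$.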

The harder part is to bound the discovery time $\tau^*$ from below. The main
difficulty to obtain the adequate lower bound on $\tau^*$ is that the process
starts very close to the master sequence, hence the probability of creating
quickly a master sequence is not very small. Our strategy consists in
exhibiting a scenario in which the whole population is driven into a
neighborhood of the equilibrium $\ell_\kappa$. Once the whole population is close to $\ell_\kappa$,
the probability to create a master sequence in a short time is of order $1/\kappa^\ell$,
thus it requires a time of order $\kappa^\ell$. The key point is to design a scenario
whose probability is much larger than $1/\kappa^\ell$. Indeed, the discovery time is
bounded from below by the probability of the scenario multiplied by $\kappa^\ell$.
We rely on the following scenario. First we ensure that until time $m\ell^{3/4}$, no
mutation can recreate the master sequence. This implies that $\tau^* > m\ell^{3/4}$.
Let us look at the population at time $m\ell^{3/4}$. Each chromosome present
at this time has undergone an evolution whose law is the same as the
mutation dynamics studied in section 11.2. The initial drift of the mutation
dynamics is quite violent, therefore at time $m\ell^{3/4}$, it is very unlikely that a
chromosome is still in $\{0, \dots, \ln \ell\}$. The problem is that the chromosomes
are not independent. We take care of this problem with the help of the
FKG inequality and an exponential estimate. Thus, at time $m\ell^{3/4}$, in this
scenario, all the chromosomes of the population are at distance larger than
$\ln \ell$ from the master sequence. We wait next until time $m\ell^2$. Because of
the mutation drift, a chromosome starting at $\ln \ell$ has a very low probability
of hitting 0 before time $m\ell^2$. Thus the process is very unlikely to discover
the master sequence before time $m\ell^2$. Arguing again as before, we obtain
that, for any $\varepsilon > 0$, at time $m\ell^2$, it is very unlikely that a particle evolving
with the mutation dynamics is still in $\{0, \dots, \ell_\kappa (1 - \varepsilon)\}$. Thus, according
to this scenario, we have $\tau^* > m\ell^2$ and

$$ \forall i \in \{1, \dots, m\} \qquad D_{m\ell^2}(i) \ge \ell_\kappa (1 - \varepsilon) . $$

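Let us next make the scenario and the corresponding estimates precise. We
suppose that the distance process starts from $(1)^m$ and we will estimate
the probability of a specific scenario leading to a discovery time close to $\kappa^\ell$.
Let $\mathcal{E}$ be the event

$$ \mathcal{E} = \big\{\, \forall n \le m\ell^{3/4} \quad \forall l \le \ln \ell \qquad U_{n,l} > p/\kappa \,\big\} . $$

If the event $\mathcal{E}$ occurs, then, until time $m\ell^{3/4}$, none of the mutation events
in the process $(D_n)_{n \ge 0}$ can create a master sequence. Indeed, on $\mathcal{E}$,

$$ \forall b \in \{1, \dots, \ell\} \quad \forall n \le m\ell^{3/4} $$

$$ \mathcal{M}_H(b, U_{n,1}, \dots, U_{n,\ell}) \ge \mathcal{M}_H(1, U_{n,1}, \dots, U_{n,\ell}) \ge 1 + \sum_{l=2}^{\ell} 1_{U_{n,l} > 1 - p(1 - 1/\kappa)} \ge 1 . $$

Thus, on the event $\mathcal{E}$, we have $\tau^* > m\ell^{3/4}$. The probability of $\mathcal{E}$ is

$$ P(\mathcal{E}) = \Big(1 - \frac{p}{\kappa}\Big)^{m \ell^{3/4} \ln \ell} . $$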
Let e > 0. We suppose that the process starts from (l)™ 1 and we estimate 
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the probability 

P(t* >^ 1 -^) > P(t* > K £ ^\£) 

> p(yte{ mi 3 /*, k^ 1 -*) } D t eAf, e) 
= p ( Vi e { m£3/4 ' • • • > k£(1_£) } AeAT, L» mf 3/4 = d, £) 
> ^(vt G {mf 5 / 4 ,...,*^ 1 ^} D t eN\ D ml 3,i = d, £] 

d>(\n£) m 

X P{ D mim =d,S). 

Using the Markov property, we have 

P(yt e { m£ 3/4 7 . . . , n e{1 ~ 6) } D t €Af\ D mi s/4 = d, £ ) 

= P(yt e {0,...,K^ (1 - e) -mf 3 / 4 } A eJV| A) = d) 
= p(t* > K^ 1 "^ - m£ 3 / 4 | D = d) > P(r* > | D = d) . 

In the neutral case, by corollary 18.61 the distance process is monotone. 
Therefore, for d> (lnfp, 

P(t* > k^ 1 -") | U = d) > P(r* > K l{1 ~ e) | Do = (ln^) m ) . 

Reporting in the previous sum, we get 

P(t* > k'^) > 

P(t* > K e(1 -^ | D = (ln£) m ) P(D mi 3/4 > (ln£) m , £) . 

We first study the last term in the above inequality. The status of the
process at time $m\ell^{3/4}$ is a function of the random vectors

$$ R_n = (S_n, I_n, J_n, U_{n,1}, \dots, U_{n,\ell}) , \qquad 1 \le n \le m\ell^{3/4} . $$

We make an intermediate conditioning with respect to $S_n, I_n, J_n$:

$$ P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m,\ \mathcal{E}\big) = E\Big( P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m,\ \mathcal{E} \,\big|\, S_n, I_n, J_n,\ 1 \le n \le m\ell^{3/4}\big) \Big) . $$

The variables $S_n, I_n, J_n$, $1 \le n \le m\ell^{3/4}$ being fixed, the state of the process
at time $m\ell^{3/4}$ is a non-decreasing function of the variables

$$ U_{n,1}, \dots, U_{n,\ell} , \qquad 1 \le n \le m\ell^{3/4} . $$

Indeed, the mutation map $\mathcal{M}_H(\cdot, u_1, \dots, u_\ell)$ is non-decreasing with respect
to $u_1, \dots, u_\ell$. Thus the events $\mathcal{E}$ and $\{ D_{m\ell^{3/4}} \ge (\ln \ell)^m \}$ are both
non-decreasing with respect to these variables. By the FKG inequality for a
product measure (see the end of the appendix), we have

$$ P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m,\ \mathcal{E} \,\big|\, S_n, I_n, J_n,\ 1 \le n \le m\ell^{3/4}\big) \ge P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m \,\big|\, S_n, I_n, J_n,\ 1 \le n \le m\ell^{3/4}\big) \times P\big(\mathcal{E} \,\big|\, S_n, I_n, J_n,\ 1 \le n \le m\ell^{3/4}\big) . $$

Yet $\mathcal{E}$ does not depend on the variables $S_n, I_n, J_n$, therefore

$$ P\big(\mathcal{E} \,\big|\, S_n, I_n, J_n,\ 1 \le n \le m\ell^{3/4}\big) = P(\mathcal{E}) . $$

Reporting in the conditioning, we obtain

$$ P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m,\ \mathcal{E}\big) \ge E\Big( P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m \,\big|\, S_n, I_n, J_n,\ 1 \le n \le m\ell^{3/4}\big)\, P(\mathcal{E}) \Big) = P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m\big)\, P(\mathcal{E}) . $$

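Next,

$$ P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m\big) = 1 - P\big(\exists i \in \{1, \dots, m\} \quad D_{m\ell^{3/4}}(i) < \ln \ell\big) \ge 1 - \sum_{1 \le i \le m} P\big(D_{m\ell^{3/4}}(i) < \ln \ell\big) . $$

From proposition 11.1,

$$ \forall i \in \{1, \dots, m\} \qquad P\big(D_{m\ell^{3/4}}(i) < \ln \ell\big) = P\big(Y_{N(m\ell^{3/4})} < \ln \ell\big) , $$

therefore

$$ P\big(D_{m\ell^{3/4}} \ge (\ln \ell)^m\big) \ge 1 - m\, P\big(Y_{N(m\ell^{3/4})} < \ln \ell\big) . $$

We estimate now the probability of the event $\{ Y_{N(m\ell^{3/4})} < \ln \ell \}$, or rather
its complement. The random variable $N(m\ell^{3/4})$ follows the binomial law
with parameters $m\ell^{3/4}$ and 1/m, therefore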



P 



(N(m£ 3/4 ) <Vi) < P^exp ( - N(m£ 3/4 )) > exp (-VI)) 



I n ml 3 ' 4 - 



< exp + 1 

\em m 



< exp 



(V£ + £ 3 / 4 (i-l)) < exp(7l-^ 3 / 4 ). 
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Combining the previous estimates and using proposition 1 1 1 .91 we have 
PpN{mt»l<) > ln ^) > P( Y *W") ^ ln£ ' N(m£ 3 / 4 ) > VI) 

+ 00 

> E p ( Y N( m ty*) > hit, N(m£ 3 / 4 ) = t) 
t=s/2 

= P ( Y t > lni ) P(N{m£ 3/i ) = t) 

> (l - exp ( - i(ln^) 2 )) (l - cxp (fi 

> l-exp(-i(ln^) 1 

the last inequality being valid for t large enough. Putting the previous 
estimates together, we have 

■.rnl 3 / 4 In e 



P(D m(3/ . > (ln£) m , £)>(l- mexp ( - t^) 2 )) (l - -J 
We study next 

We give first an estimate showing that a visit to 0 becomes very unlikely
if the starting point is far from 0.

Lemma 11.15 For $b \in \{1, \dots, \ell\}$, we have

$$ \forall n \ge 0 \qquad P\big(\tau^* \le n \mid D_0 = (b)^m\big) \le n m\, \frac{\mathcal{B}(0)}{\mathcal{B}(b)} . $$

Proof. Let $n \ge 0$ and $b \in \{1, \dots, \ell\}$. We write

$$ P\big(\tau^* \le n \mid D_0 = (b)^m\big) = P\big(\exists t \le n \quad \exists i \in \{1, \dots, m\} \quad D_t(i) = 0 \mid D_0 = (b)^m\big) \le \sum_{1 \le t \le n} \sum_{1 \le i \le m} P\big(D_t(i) = 0 \mid D_0 = (b)^m\big) . $$

By proposition 11.1, for any $t \ge 0$, any $i \in \{1, \dots, m\}$,

$$ P\big(D_t(i) = 0 \mid D_0 = (b)^m\big) = P\big(Y_{N(t)} = 0 \mid Y_0 = b\big) . $$

Using proposition 11.3 and lemma A.1, we have

$$ P\big(Y_{N(t)} = 0 \mid Y_0 = b\big) \le \frac{\mathcal{B}(0)}{\mathcal{B}(b)} . $$

Putting together the previous inequalities, we get

$$ P\big(\tau^* \le n \mid D_0 = (b)^m\big) \le n m\, \frac{\mathcal{B}(0)}{\mathcal{B}(b)} $$

as requested. □
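
The key ingredient of this proof, the bound $P(Y_t = 0 \mid Y_0 = b) \le \mathcal{B}(0)/\mathcal{B}(b)$ valid uniformly in t, follows from reversibility, since $\mathcal{B}(b) M_H^t(b, 0) = \mathcal{B}(0) M_H^t(0, b) \le \mathcal{B}(0)$. It can be observed numerically by iterating the law of the chain. The sketch below is an illustration only; the helper names and the parameters $\ell = 12$, $\kappa = 3$, $p = 0.15$, $b = 6$ and the horizon of 200 steps are assumptions.

```python
from math import comb

def lumped_matrix(ell, kappa, p):
    """Lumped mutation matrix M_H on {0, ..., ell} (floating point)."""
    up, down = p * (1 - 1 / kappa), p / kappa
    M = [[0.0] * (ell + 1) for _ in range(ell + 1)]
    for b in range(ell + 1):
        for k in range(ell - b + 1):
            for h in range(b + 1):
                M[b][b + k - h] += (comb(ell - b, k) * up**k * (1 - up)**(ell - b - k)
                                    * comb(b, h) * down**h * (1 - down)**(b - h))
    return M

def binomial_law(ell, kappa):
    """Binomial law with parameters ell and 1 - 1/kappa."""
    return [comb(ell, b) * (1 - 1 / kappa)**b * (1 / kappa)**(ell - b)
            for b in range(ell + 1)]

ell, kappa, p, b, horizon = 12, 3, 0.15, 6, 200
M, B = lumped_matrix(ell, kappa, p), binomial_law(ell, kappa)

law = [1.0 if x == b else 0.0 for x in range(ell + 1)]   # law of Y_0
worst = 0.0                                              # max over t of P(Y_t = 0)
for _ in range(horizon):
    law = [sum(law[x] * M[x][y] for x in range(ell + 1)) for y in range(ell + 1)]
    worst = max(worst, law[0])

ratio = B[0] / B[b]
print(worst, ratio)
```

The lemma then follows by a union bound over the n time steps and the m chromosomes, which explains the factor nm.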
Let e' > 0. Now 



P 



(t* > K e{1 - £) \D Q = (ln<O m ) 

(t* > mf, D t G TV for m£ 2 < f < | L> = (ln^) m ) 



> P 



- V-Dt G W for m4 < t < D " ~ {inl > 

> ^ P^Dt £Af for m£ 2 <t < n^ 1 -^ \t* > mi 2 , D ml 2 = d 

d>(l K (l-e>))™ 

x p(t* > m£ 2 , D mP = d\ D = (lnf) 

Using the Markov property and the monotonicity of the process (D t )t>o, 
we have for d > (4(1 - e')) m , 

P(p t G M for ml 2 < t < n t{1 ~ e) | r* > ml 2 , D mi 2 = dj 

= P(yt G { 0, . . . , K^ 1 -^ - ml 2 } D t G N | D = d\ 
= P(r* > - ml 2 \D = dj > P(r* > K £(1 - £) \D =dj 

> P(r* >K i ^\Do^(£ K (l-e')) m ). 
Reporting in the previous sum, we get 

p(t* > K l{1 ~ e) | D = (lnf) m ) > P(t* > k^ 1 -^ | D = (4(1 - e')) m ) 
x p(t* > m£ 2 , D mi 2 > (4(1 - e')) m \ D Q = (ln£) m ) . 
We hrst take care of the last probability. We write 

p(t* > m£ 2 , D ml 2 > (4(1 - £'))" | D = (lnf) ro ) > 

P(p ml 2 > (4(1 - e')V | Do = (ln£) m ) - P(r* < me 2 \ D = (ln£) m ) . 

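To control the last term, we use the inequality of lemma 11.15 with $n = m\ell^2$
and $b = \ln \ell$:

$$ P\big(\tau^* \le m\ell^2 \mid D_0 = (\ln \ell)^m\big) \le (m\ell)^2\, \frac{\mathcal{B}(0)}{\mathcal{B}(\ln \ell)} . $$

By lemma 11.4, we have

$$ \frac{\mathcal{B}(0)}{\mathcal{B}(\ln \ell)} \le \Big(\frac{2 \ln \ell}{\ell}\Big)^{\ln \ell} , $$

whence

$$ P\big(\tau^* \le m\ell^2 \mid D_0 = (\ln \ell)^m\big) \le (m\ell)^2 \Big(\frac{2 \ln \ell}{\ell}\Big)^{\ln \ell} . $$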
For the other term, we use the monotonicity of the process {D t )t>o to get 

P(D mi 2 > (4(1 - e')) m | D = (ln£) m ) 

> P(D mP > (4(1 - e')) m | D = (0) m ) 
= l-P(3tG{l,...,m} D mP {i)<£ K {l-e')\Do = (0) m ) 

> i - y, p ( D ™* 2 (*) < - e ') I D ° = (°r) ■ 

1 < i < rn 

From proposition 1 1 1 . ll for any z € {l,...,m}, 

p(D mP (i) < £ K {l-e') | Z? = (0) m ) = P(Y N{me) < i K {l-e') \ Y = o) , 
therefore 

p(A^ > (4(1 - e')) m | L>o = (ln£) m ) 

> 1 - mP(yjv K) < 4(1 - £') | Y = 0) . 

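We now estimate the probability of the event $\{\, Y_{N(m\ell^2)} < \ell_\kappa(1-\varepsilon') \,\}$, or
rather its complement. The random variable $N(m\ell^2)$ follows the binomial
law with parameters $m\ell^2$ and $1/m$, therefore

$$
\begin{aligned}
P\Big(N(m\ell^2) < \frac{4\ell}{a\varepsilon'}\Big)
&\leq P\Big(\exp\big(-N(m\ell^2)\big) > \exp\Big(-\frac{4\ell}{a\varepsilon'}\Big)\Big) \\
&\leq \exp\Big(\frac{4\ell}{a\varepsilon'}\Big) \Big(\frac{1}{em} + 1 - \frac{1}{m}\Big)^{m\ell^2} \\
&\leq \exp\Big(\frac{4\ell}{a\varepsilon'} + \ell^2\Big(\frac{1}{e} - 1\Big)\Big)
 \leq \exp\Big(\frac{4\ell}{a\varepsilon'} - \frac{\ell^2}{2}\Big) .
\end{aligned}
$$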
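The exponential Markov (Chernoff) bound used for the binomial variable $N(m\ell^2)$ can be checked numerically. The sketch below is only an illustration under assumed values: the function names `binom_cdf_below` and `chernoff_bounds` are hypothetical, and the threshold `t0` is a free parameter standing in for the threshold of the text.

```python
import math

def binom_cdf_below(n, p, t0):
    """Exact P(N < t0) for N ~ Binomial(n, p)."""
    k_max = math.ceil(t0) - 1  # largest integer strictly below t0
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(0, k_max + 1))

def chernoff_bounds(m, ell, t0):
    """Three successive upper bounds on P(N(m ell^2) < t0).

    N(m ell^2) is binomial with parameters m*ell^2 and 1/m, so that
    E[exp(-N)] = (1/(e m) + 1 - 1/m)^(m ell^2).
    """
    n = m * ell**2
    laplace = (1 / (math.e * m) + 1 - 1 / m) ** n      # E[exp(-N)]
    first = math.exp(t0) * laplace                     # Markov on exp(-N)
    second = math.exp(t0 + ell**2 * (1 / math.e - 1))  # uses 1 + x <= exp(x)
    third = math.exp(t0 - ell**2 / 2)                  # uses 1/e - 1 <= -1/2
    return first, second, third

# Illustrative (assumed) values: m = 5, ell = 4, threshold t0 = 6.
exact = binom_cdf_below(5 * 4**2, 1 / 5, 6.0)
print(exact, chernoff_bounds(5, 4, 6.0))
```

Each bound is weaker than the previous one, so the printed values should be non-decreasing from left to right.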
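Combining the previous estimates and using proposition 11.10, we have

$$
\begin{aligned}
P\big(Y_{N(m\ell^2)} \geq \ell_\kappa(1-\varepsilon') \,\big|\, Y_0 = 0\big)
&\geq P\Big(Y_{N(m\ell^2)} \geq \ell_\kappa(1-\varepsilon'),\ N(m\ell^2) \geq \frac{4\ell}{a\varepsilon'} \,\Big|\, Y_0 = 0\Big) \\
&\geq \sum_{t = 4\ell/(a\varepsilon')}^{+\infty} P\big(Y_{N(m\ell^2)} \geq \ell_\kappa(1-\varepsilon'),\ N(m\ell^2) = t \,\big|\, Y_0 = 0\big) \\
&= \sum_{t = 4\ell/(a\varepsilon')}^{+\infty} P\big(Y_t \geq \ell_\kappa(1-\varepsilon') \,\big|\, Y_0 = 0\big)\, P\big(N(m\ell^2) = t\big) \\
&\geq \big(1 - \exp(-c(\varepsilon')\ell)\big) \Big(1 - \exp\Big(\frac{4\ell}{a\varepsilon'} - \frac{\ell^2}{2}\Big)\Big) \\
&\geq 1 - \exp\Big(-\frac{1}{2} c(\varepsilon')\ell\Big) ,
\end{aligned}
$$

where $c(\varepsilon') > 0$ and the last inequality is valid for $\ell$ large enough. Putting
together the previous estimates, we obtain

$$
P\big(\tau^* > m\ell^2,\ D_{m\ell^2} \geq (\ell_\kappa(1-\varepsilon'))^m \,\big|\, D_0 = (\ln \ell)^m\big)
\geq 1 - m \exp\Big(-\frac{1}{2} c(\varepsilon')\ell\Big) - (m\ell)^2 \Big(\frac{2 \ln \ell}{\ell}\Big)^{\ln \ell} .
$$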

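It remains to study $P\big(\tau^* > \kappa^{\ell(1-\varepsilon)} \,\big|\, D_0 = (\ell_\kappa(1-\varepsilon'))^m\big)$. We use the
inequality of lemma 11.15 with $n = \kappa^{\ell(1-\varepsilon)}$ and $b = \ell_\kappa(1-\varepsilon')$:

$$
P\big(\tau^* \leq \kappa^{\ell(1-\varepsilon)} \,\big|\, D_0 = (\ell_\kappa(1-\varepsilon'))^m\big)
\leq \kappa^{\ell(1-\varepsilon)}\, m\, \frac{B(0)}{B(\ell_\kappa(1-\varepsilon'))} .
$$

For $\varepsilon'$ small enough, using the large deviation estimates of lemma 11.5, we
see that there exists $c(\varepsilon,\varepsilon') > 0$ such that, for $\ell$ large enough,

$$
P\big(\tau^* \leq \kappa^{\ell(1-\varepsilon)} \,\big|\, D_0 = (\ell_\kappa(1-\varepsilon'))^m\big) \leq \exp\big(-c(\varepsilon,\varepsilon')\ell\big) .
$$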

Collecting all the previous estimates, we conclude that, for £ large enough, 

1, ( p\m£ 3/4 ln£ 



p(t* > I D = (l) m ) > (l-mcxp(--(ln^) 2 )) (l-£ 

/ \ / 1 / 9 In £\ 1" A 

x(l-exp4c( e , e 'K))^l-mexp(-- C 4K) - M0 2 ( — ) J- 
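Moreover, by Markov's inequality,

$$
E\big(\tau^* \,\big|\, D_0 = (1)^m\big) \geq \kappa^{\ell(1-\varepsilon)}\, P\big(\tau^* > \kappa^{\ell(1-\varepsilon)} \,\big|\, D_0 = (1)^m\big) .
$$

It follows that

$$
\liminf_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}}
\frac{1}{\ell} \ln E\big(\tau^* \,\big|\, D_0 = (1)^m\big) \geq (1-\varepsilon) \ln \kappa .
$$

Letting $\varepsilon$ go to 0 yields the desired lower bound. □

12 Synthesis

As in theorem 3.1, we suppose that

$$
\ell \to +\infty , \qquad m \to +\infty , \qquad q \to 0 ,
$$

in such a way that

$$
\ell q \to a \in \,]0, +\infty[ , \qquad \frac{m}{\ell} \to \alpha \in [0, +\infty] .
$$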

We now put together the estimates of sections 10 and 11 in order to evaluate
the formula for the invariant measure obtained at the end of section | 
For $\theta = \ell, 1$, we rewrite this formula as



E(r \ 


zl = 1) 


E(T*\O 6 = ol it )+E(r \ 


A = i) 



Ll H (0^E(r n \Z<> = l) f fay® 



.M h (9,O)E(t o \Z0 
By proposition 10.3,

1 m 

I™ f ; r-r- 1 — g r + l) YVf— V(*) = /(p*( a )) 

f, m ^oo Vm h (0,O)-E(t o \Z® = 1) J^ J \mJ w 

By corollary fTTHl 

$$
\lim_{\substack{\ell, m \to \infty \\ q \to 0,\ \ell q \to a}}
\frac{1}{m} \ln E\big(\tau_0 \,\big|\, Z^\theta_0 = 1\big) = \int_0^{\rho^*(a)} \ln \phi(e^{-a}, 0, s)\, ds .
$$

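By proposition 11.13, for $\alpha \in [0, +\infty[$,

$$
\lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}}
\frac{1}{\ell} \ln E\big(\tau^* \,\big|\, O^\theta_0 = o^\theta_{\text{exit}}\big) = \ln \kappa .
$$

For the case $\alpha = +\infty$, by corollary 11.14 and proposition 11.12,

$$
\limsup_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \infty}}
\frac{1}{\ell} \ln \Big( \frac{1}{m}\, E\big(\tau^* \,\big|\, O^\theta_0 = o^\theta_{\text{exit}}\big) \Big) \leq \ln \kappa .
$$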

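These estimates allow us to evaluate the ratio between the discovery time and
the persistence time. We define a function $\phi : \,]0, +\infty[\, \to [0, +\infty]$ by setting
$\phi(a) = 0$ if $a \geq \ln \sigma$ and

$$
\forall a < \ln \sigma \qquad \phi(a) = \int_0^{\rho^*(a)} \ln \phi(e^{-a}, 0, s)\, ds .
$$

We have then, for $\alpha \in [0, +\infty[$ or $\alpha = +\infty$,

$$
\lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}}
\frac{E\big(\tau_0 \,\big|\, Z^\theta_0 = 1\big)}{E\big(\tau^* \,\big|\, O^\theta_0 = o^\theta_{\text{exit}}\big)}
= \begin{cases}
0 & \text{if } \alpha\, \phi(a) < \ln \kappa \\
+\infty & \text{if } \alpha\, \phi(a) > \ln \kappa
\end{cases}
$$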

Notice that the result is the same for $\theta = \ell$ and $\theta = 1$. Putting together
the bounds on $\nu$ given in section 9.4 and the previous considerations, we
conclude that

$$
\lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}}
\int_{[0,1]} f\, d\nu
= \begin{cases}
0 & \text{if } \alpha\, \phi(a) < \ln \kappa \\
f(\rho^*(a)) & \text{if } \alpha\, \phi(a) > \ln \kappa
\end{cases}
$$

This is valid for any continuous non-decreasing function $f : [0, 1] \to \mathbb{R}$
such that $f(0) = 0$. To obtain the statement of theorem 3.1, it remains to
compute the integral. For $a < \ln \sigma$,



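$$
\begin{aligned}
\int_0^{\rho^*(a)} \ln \phi(e^{-a}, 0, s)\, ds
&= \int_0^{\rho^*(a)} \ln \frac{\sigma e^{-a}(1-s)}{\sigma(1-e^{-a})s + (1-s)}\, ds \\
&= \frac{\sigma(1-e^{-a}) \ln \dfrac{\sigma(1-e^{-a})}{\sigma - 1} + \ln(\sigma e^{-a})}{1 - \sigma(1-e^{-a})}
\end{aligned}
$$

and we are done.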
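The closed formula can be compared with a direct numerical integration. The sketch below is an illustration only: the function names are hypothetical, and it assumes that $\rho^*(a) = (\sigma e^{-a} - 1)/(\sigma - 1)$, which is the point where the integrand vanishes (the definition of $\rho^*$ is not restated in this section). It also evaluates the value $\ln \kappa / \phi(a)$ of $\alpha$ on the critical curve $\alpha\, \phi(a) = \ln \kappa$.

```python
import math

def rho_star(a, sigma):
    # Assumption: rho*(a) is the zero of the integrand ln phi(e^{-a}, 0, s).
    return (sigma * math.exp(-a) - 1.0) / (sigma - 1.0)

def integrand(a, sigma, s):
    b = math.exp(-a)
    return math.log(sigma * b * (1.0 - s) / (sigma * (1.0 - b) * s + (1.0 - s)))

def phi_closed(a, sigma):
    """Closed form of phi(a); phi(a) = 0 when a >= ln(sigma).

    The formula has a removable singularity at sigma * (1 - e^{-a}) = 1,
    which this sketch does not handle.
    """
    if a >= math.log(sigma):
        return 0.0
    u = sigma * (1.0 - math.exp(-a))
    num = u * math.log(u / (sigma - 1.0)) + math.log(sigma * math.exp(-a))
    return num / (1.0 - u)

def phi_numeric(a, sigma, n=20000):
    """Composite midpoint rule for the integral over [0, rho*(a)]."""
    if a >= math.log(sigma):
        return 0.0
    h = rho_star(a, sigma) / n
    return h * sum(integrand(a, sigma, (i + 0.5) * h) for i in range(n))

def critical_alpha(a, sigma, kappa):
    """Value of alpha on the curve alpha * phi(a) = ln(kappa)."""
    p = phi_closed(a, sigma)
    return math.inf if p == 0.0 else math.log(kappa) / p

print(phi_closed(0.5, 4.0), phi_numeric(0.5, 4.0), critical_alpha(0.5, 4.0, 4))
```

As $a$ tends to 0, `phi_closed(a, sigma)` tends to $\ln \sigma$, so the critical value tends to $\ln \kappa / \ln \sigma$, in agreement with the small mutation regime described in the abstract.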
A Appendix on Markov chains



In this appendix, we recall classical definitions and results from the theory
of Markov chains with finite state space. The goal is to clarify the objects
involved in the definition of the model, and to state the fundamental general
results used in the proofs. This material can be found in any reference book
on Markov chains, for instance [8], [17], [21]. The definitions and results on
monotonicity, coupling and the FKG inequality are presented in the books
of Liggett [25] and Grimmett [19].

Construction of continuous time Markov processes. The most convenient
way to define a continuous time process is to give its infinitesimal
generator. The infinitesimal generator of a Markov process $(X_t)_{t \geq 0}$ with
values in a finite state space $\mathcal{E}$ is the linear operator $L$ acting on the
functions from $\mathcal{E}$ to $\mathbb{R}$ defined as follows. For any function $\phi : \mathcal{E} \to \mathbb{R}$, any
$x \in \mathcal{E}$,

$$
L\phi(x) = \lim_{t \to 0} \frac{1}{t} \big( E(\phi(X_t) \,|\, X_0 = x) - \phi(x) \big) .
$$

It turns out that the law of the process $(X_t)_{t \geq 0}$ is entirely determined by the
generator $L$. Therefore all the probabilistic results on the process $(X_t)_{t \geq 0}$
can in principle be derived working only with its infinitesimal generator.

In the case where the state space of the process is finite, the situation
is quite simple and it is possible to provide direct constructions of a
process having a specific infinitesimal generator. These constructions are not
unique, but they provide more insight into the dynamics. Suppose that the
generator $L$ is given by

$$
\forall x \in \mathcal{E} \qquad L\phi(x) = \sum_{y \in \mathcal{E}} c(x, y) \big( \phi(y) - \phi(x) \big) .
$$

The evolution of a process $(X_t)_{t \geq 0}$ having $L$ as infinitesimal generator can
loosely be described as follows. Suppose that $X_t = x$. Let

$$
c(x) = \sum_{y \neq x} c(x, y) .
$$

Let $\tau$ be a random variable whose law is exponential with parameter $c(x)$:

$$
\forall s \geq 0 \qquad P(\tau > s) = \exp(-c(x) s) .
$$

The process waits at $x$ until time $t + \tau$. At time $t + \tau$, it jumps to a state
$y \neq x$ chosen according to the following law:

$$
P(X_{t+\tau} = y) = \frac{c(x, y)}{c(x)} .
$$

The same scheme is then applied starting from $y$. In this construction, the
waiting times $\tau$ and the jumps are all independent.
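The construction just described can be sketched in a few lines. This is a minimal illustration, not part of the model of the paper: the function name `simulate_jump_process`, the dictionary encoding of the rates $c(x, y)$ and the three-state example are all assumptions made for the example.

```python
import random

def simulate_jump_process(rates, x0, t_max, rng):
    """Simulate a finite-state jump process up to time t_max.

    rates[x][y] is the rate c(x, y) for y != x (hypothetical encoding).
    Returns the list of (jump time, new state), starting with (0.0, x0).
    """
    t, x = 0.0, x0
    path = [(0.0, x0)]
    while True:
        out = rates.get(x, {})
        total = sum(out.values())          # c(x), the total jump rate at x
        if total == 0.0:                   # absorbing state: no more jumps
            break
        t += rng.expovariate(total)        # exponential waiting time
        if t > t_max:
            break
        # Jump to y with probability c(x, y) / c(x).
        x = rng.choices(list(out), weights=list(out.values()))[0]
        path.append((t, x))
    return path

# Illustrative rates on a three-state space (assumed values).
rates = {"a": {"b": 1.0, "c": 2.0}, "b": {"a": 0.5}, "c": {"a": 1.5, "b": 0.5}}
for time, state in simulate_jump_process(rates, "a", 5.0, random.Random(0)):
    print(round(time, 3), state)
```

The waiting times and the jump choices are drawn independently, as in the construction above.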

Construction of discrete time Markov chains. To build a discrete
time Markov chain, we need only to define its transition mechanism. When
the state space $\mathcal{E}$ is finite, this amounts to giving its transition matrix

$$
\big( p(x, y),\ x, y \in \mathcal{E} \big) .
$$

The only requirement on $p$ is that it is a stochastic matrix, i.e., it satisfies

$$
\begin{aligned}
&\forall x, y \in \mathcal{E} \qquad 0 \leq p(x, y) \leq 1 , \\
&\forall x \in \mathcal{E} \qquad \sum_{y \in \mathcal{E}} p(x, y) = 1 .
\end{aligned}
$$

In the sequel, we consider a discrete time Markov chain $(X_t)_{t \geq 0}$ with values
in a finite state space $\mathcal{E}$ and with transition matrix $(p(x, y))_{x, y \in \mathcal{E}}$.
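As an illustration, the two conditions defining a stochastic matrix can be checked mechanically, and the chain can be simulated one transition at a time. The names `is_stochastic` and `simulate_chain`, the encoding of the state space as $\{0, \dots, n-1\}$ and the example matrix are assumptions made for this sketch.

```python
import random

def is_stochastic(p, tol=1e-12):
    """Check that all entries lie in [0, 1] and that each row sums to 1."""
    return all(
        all(0.0 <= v <= 1.0 for v in row) and abs(sum(row) - 1.0) <= tol
        for row in p
    )

def simulate_chain(p, x0, steps, rng):
    """Return the path X_0, ..., X_steps of the chain with matrix p."""
    path = [x0]
    for _ in range(steps):
        current = path[-1]
        # Draw the next state according to the row p(current, .).
        path.append(rng.choices(range(len(p)), weights=p[current])[0])
    return path

# A lazy birth and death chain on {0, 1, 2} (illustrative values).
p = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
print(is_stochastic(p), simulate_chain(p, 0, 10, random.Random(1)))
```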

Invariant probability measure. If the Markov chain is irreducible and
aperiodic, then it admits a unique invariant probability measure $\mu$, i.e., the
set of equations

$$
\mu(y) = \sum_{x \in \mathcal{E}} \mu(x)\, p(x, y) , \qquad y \in \mathcal{E} ,
$$

admits a unique solution. The Markov chain $(X_t)_{t \geq 0}$ is said to be reversible
with respect to a probability measure $\nu$ if it satisfies the detailed balance
conditions:

$$
\forall x, y \in \mathcal{E} \qquad \nu(x)\, p(x, y) = \nu(y)\, p(y, x) .
$$

If the Markov chain $(X_t)_{t \geq 0}$ is reversible with respect to a probability
measure $\nu$, then $\nu$ is an invariant probability measure for $(X_t)_{t \geq 0}$. In
case $(X_t)_{t \geq 0}$ is in addition irreducible and aperiodic, then $\nu$ is the unique
invariant probability measure of the chain.
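For a small chain, the invariant measure and the detailed balance conditions can be examined numerically. The sketch below approximates $\mu$ by iterating $\mu \mapsto \mu p$ from the uniform distribution, which is one possible method among others (it converges for irreducible aperiodic chains); the function names and the two example matrices are assumptions made for the illustration.

```python
def invariant_measure(p, iters=10_000, tol=1e-14):
    """Approximate the invariant measure by iterating mu -> mu p."""
    n = len(p)
    mu = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(mu[x] * p[x][y] for x in range(n)) for y in range(n)]
        if max(abs(a - b) for a, b in zip(new, mu)) < tol:
            return new
        mu = new
    return mu

def is_reversible(p, nu, tol=1e-9):
    """Check the detailed balance conditions nu(x) p(x,y) = nu(y) p(y,x)."""
    n = len(p)
    return all(
        abs(nu[x] * p[x][y] - nu[y] * p[y][x]) <= tol
        for x in range(n) for y in range(n)
    )

# A birth and death chain (reversible) and a biased cycle (not reversible).
birth_death = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
cycle = [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]
for p in (birth_death, cycle):
    mu = invariant_measure(p)
    print([round(v, 6) for v in mu], is_reversible(p, mu))
```

The biased cycle shows that an invariant measure need not satisfy detailed balance: reversibility is sufficient for invariance, not necessary.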

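Lemma A.1 Suppose that $\mu$ is an invariant probability measure for the
Markov chain $(X_t)_{t \geq 0}$. We have then

$$
\forall x, y \in \mathcal{E} \quad \forall t \geq 0 \qquad \mu(x)\, P(X_t = y \,|\, X_0 = x) \leq \mu(y) .
$$

Proof. The proof is done by induction on $t$. For $t = 0$, we have

$$
P(X_0 = y \,|\, X_0 = x) = 0 \quad \text{if } y \neq x ,
$$

and the result holds. Suppose it has been proved until time $t \in \mathbb{N}$. We
have then, for $x, y \in \mathcal{E}$,

$$
\begin{aligned}
\mu(x)\, P(X_{t+1} = y \,|\, X_0 = x)
&= \sum_{z \in \mathcal{E}} \mu(x)\, P(X_{t+1} = y,\ X_t = z \,|\, X_0 = x) \\
&= \sum_{z \in \mathcal{E}} \mu(x)\, P(X_t = z \,|\, X_0 = x)\, P(X_{t+1} = y \,|\, X_t = z) \\
&\leq \sum_{z \in \mathcal{E}} \mu(z)\, p(z, y) = \mu(y)
\end{aligned}
$$

and the claim is proved at time $t + 1$. □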
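The inequality of lemma A.1 can be verified on a small example by computing the $t$-step transition probabilities explicitly. The sketch below is an illustration with hypothetical names (`mat_mul`, `check_lemma_a1`) and an assumed three-state chain whose invariant measure is uniform.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def check_lemma_a1(p, mu, t_max, tol=1e-12):
    """Check mu(x) P(X_t = y | X_0 = x) <= mu(y) for all x, y and t <= t_max."""
    n = len(p)
    # pt holds the t-step transition matrix, starting from the identity (t = 0).
    pt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t_max + 1):
        for x in range(n):
            for y in range(n):
                if mu[x] * pt[x][y] > mu[y] + tol:
                    return False
        pt = mat_mul(pt, p)
    return True

# The matrix below is doubly stochastic, so the uniform measure is invariant.
cycle = [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]
print(check_lemma_a1(cycle, [1 / 3, 1 / 3, 1 / 3], 50))
# The inequality can fail for a measure that is not invariant.
print(check_lemma_a1(cycle, [0.98, 0.01, 0.01], 50))
```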

We state next the ergodic theorem for Markov chains. We consider only
the case where the state space $\mathcal{E}$ is finite.

Theorem A.2 Suppose that the Markov chain $(X_t)_{t \geq 0}$ is irreducible and
aperiodic. Let $\mu$ be its invariant probability measure. For any initial
distribution $\mu_0$, for any function $f : \mathcal{E} \to \mathbb{R}$, we have, with probability one,

$$
\lim_{t \to \infty} \frac{1}{t} \int_0^t f(X_s)\, ds = \int_{\mathcal{E}} f(x)\, d\mu(x) .
$$
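The ergodic theorem can be illustrated by simulation: along a long trajectory, the time average of $f$ should be close to the average of $f$ under the invariant measure. In the sketch below, the time integral is replaced by a sum over integer times, and the function name `time_average`, the chain and the function $f$ are assumptions chosen for the example.

```python
import random

def time_average(p, f, x0, steps, rng):
    """Average of f(X_s) over s = 0, ..., steps - 1 along one trajectory."""
    states = range(len(p))
    x, total = x0, 0.0
    for _ in range(steps):
        total += f(x)
        x = rng.choices(states, weights=p[x])[0]  # one transition of the chain
    return total / steps

# Birth and death chain on {0, 1, 2}; its invariant measure is (1/4, 1/2, 1/4).
p = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
mu = [0.25, 0.5, 0.25]
space_average = sum(mu[x] * x for x in range(3))  # integral of f(x) = x
print(time_average(p, float, 0, 100_000, random.Random(42)), space_average)
```

The agreement is only approximate for a finite trajectory; the theorem is a statement about the limit.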

Lumping. The basic lumping result for Markov chains can be found in
section 6.3 of the book of Kemeny and Snell [22]. Let $(E_1, \dots, E_r)$ be a
partition of $\mathcal{E}$. Let $f : \mathcal{E} \to \{\, 1, \dots, r \,\}$ be the function defined by

$$
f(x) = \begin{cases}
1 & \text{if } x \in E_1 \\
\vdots & \quad \vdots \\
r & \text{if } x \in E_r
\end{cases}
$$

The Markov chain $(X_t)_{t \geq 0}$ is said to be lumpable with respect to the
partition $(E_1, \dots, E_r)$ if, for every initial distribution $\mu_0$ of $X_0$, the process
$(f(X_t))_{t \geq 0}$ is a Markov chain on $\{\, 1, \dots, r \,\}$ whose transition probabilities
do not depend on $\mu_0$.

Theorem A.3 (Lumping theorem) A necessary and sufficient condition
for the Markov chain $(X_t)_{t \geq 0}$ to be lumpable with respect to the
partition $(E_1, \dots, E_r)$ is that,

$$
\forall i, j \in \{\, 1, \dots, r \,\} \quad \forall x, y \in E_i \qquad
\sum_{z \in E_j} p(x, z) = \sum_{z \in E_j} p(y, z) .
$$

Suppose that this condition holds. For $i, j \in \{\, 1, \dots, r \,\}$, let us denote by
$p_E(i, j)$ the common value of the above sums. The process $(f(X_t))_{t \geq 0}$ is
then a Markov chain with transition matrix $(p_E(i, j))_{1 \leq i, j \leq r}$.
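The condition of the lumping theorem is easy to test on a finite matrix. The sketch below, with a hypothetical function `lump` and an assumed three-state example, returns the lumped transition matrix when the condition holds and `None` otherwise.

```python
def lump(p, partition, tol=1e-12):
    """Return the lumped matrix (p_E(i, j)) if the chain is lumpable, else None.

    partition is a list of blocks, each block a list of states
    (states are encoded as the indices 0, ..., n - 1 of the matrix p).
    """
    lumped = []
    for block_i in partition:
        row = []
        for block_j in partition:
            # Sum of p(x, z) over z in E_j, for each x in E_i.
            sums = [sum(p[x][z] for z in block_j) for x in block_i]
            if max(sums) - min(sums) > tol:
                return None  # the sums depend on x: not lumpable
            row.append(sums[0])
        lumped.append(row)
    return lumped

p = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
print(lump(p, [[0, 2], [1]]))  # the two end states can be merged
print(lump(p, [[0, 1], [2]]))  # this partition violates the condition
```

In the first call, states 0 and 2 have the same probability of jumping to each block, so the projected process is again a Markov chain.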
Monotonicity. We recall some standard definitions concerning
monotonicity for stochastic processes. A classical reference is Liggett's book
[25], especially for applications to particle systems. We consider a discrete
time Markov chain $(X_t)_{t \geq 0}$ with values in a space $\mathcal{E}$. We suppose that the
state space $\mathcal{E}$ is finite and that it is equipped with a partial order $\leq$. A
function $f : \mathcal{E} \to \mathbb{R}$ is non-decreasing if

$$
\forall x, y \in \mathcal{E} \qquad x \leq y \ \Longrightarrow\ f(x) \leq f(y) .
$$

The Markov chain $(X_t)_{t \geq 0}$ is said to be monotone if, for any non-decreasing
function $f$, the function

$$
x \in \mathcal{E} \mapsto E(f(X_t) \,|\, X_0 = x)
$$

is non-decreasing.

Coupling. A natural way to prove monotonicity is to construct an
adequate coupling. A coupling is a family of processes $(X^x_t)_{t \geq 0}$ indexed by
$x \in \mathcal{E}$, which are all defined on the same probability space, and such that,
for $x \in \mathcal{E}$, the process $(X^x_t)_{t \geq 0}$ is the Markov chain starting from $X_0 = x$.
The coupling is said to be monotone if

$$
\forall x, y \in \mathcal{E} \qquad x \leq y \ \Longrightarrow\ \forall t \geq 1 \quad X^x_t \leq X^y_t .
$$

If there exists a monotone coupling, then the Markov chain is monotone.
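A simple way to build such a coupling is to drive all the chains with the same random inputs. The sketch below does this for a lazy birth and death chain on $\{0, \dots, n\}$ with its usual total order; the chain, the update rule `step` and the jump probabilities are assumptions made for the illustration, not objects from the paper.

```python
import random

def step(x, u, n, p_up=0.25, p_down=0.25):
    """One transition driven by a uniform variable u in [0, 1).

    For fixed u, the map x -> step(x, u, n) is non-decreasing,
    which is what makes the coupling monotone.
    """
    if u < p_up:
        return min(x + 1, n)
    if u >= 1.0 - p_down:
        return max(x - 1, 0)
    return x

def coupled_paths(n, steps, rng):
    """Run the chains started from 0, ..., n with shared uniform variables."""
    states = list(range(n + 1))  # states[x] is the current value of X^x_t
    history = [states[:]]
    for _ in range(steps):
        u = rng.random()         # the same u is used for every starting point
        states = [step(x, u, n) for x in states]
        history.append(states[:])
    return history

def is_monotone(history):
    """Check that x <= y implies X^x_t <= X^y_t at every recorded time."""
    return all(all(a <= b for a, b in zip(row, row[1:])) for row in history)

print(is_monotone(coupled_paths(5, 1000, random.Random(1))))
```

Each coordinate taken alone is a copy of the same birth and death chain, since it is updated with a fresh uniform variable at each step; only the joint law of the family is special.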

FKG inequality. We consider the product space $[0, 1]^n$ equipped with the
product order. Let $\mu$ be a probability measure on $[0, 1]$ and let us denote by
$\mu^{\otimes n}$ the product probability measure on $[0, 1]^n$ whose marginals are equal
to $\mu$. The Harris inequality, or the FKG inequality in this context, says
that, for any non-decreasing functions $f, g : [0, 1]^n \to \mathbb{R}$, we have

$$
\int_{[0,1]^n} f g \, d\mu^{\otimes n} \geq \int_{[0,1]^n} f \, d\mu^{\otimes n} \int_{[0,1]^n} g \, d\mu^{\otimes n} .
$$
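In the special case where $\mu$ is a Bernoulli measure, the product measure is carried by the finite cube $\{0, 1\}^n$ and both sides of the inequality can be computed exactly by enumeration. The sketch below does this for two non-decreasing functions; the names `expectation` and `harris_gap`, the functions and the parameters are assumptions made for the illustration.

```python
from itertools import product

def expectation(f, n, p):
    """Integral of f against the Bernoulli(p) product measure on {0, 1}^n."""
    total = 0.0
    for w in product((0, 1), repeat=n):
        k = sum(w)  # number of coordinates equal to 1
        total += f(w) * p**k * (1 - p) ** (n - k)
    return total

def harris_gap(f, g, n, p):
    """E(fg) - E(f) E(g); non-negative when f and g are both non-decreasing."""
    fg = expectation(lambda w: f(w) * g(w), n, p)
    return fg - expectation(f, n, p) * expectation(g, n, p)

# max and sum are both non-decreasing for the product order.
print(harris_gap(max, sum, 4, 0.3))
# Replacing g by a non-increasing function reverses the sign of the gap.
print(harris_gap(max, lambda w: -sum(w), 4, 0.3))
```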

The case of Bernoulli product measures is presented in section 2.2 of
Grimmett's book [19].